Abstract


Group signature schemes allow a group member to sign messages on behalf of the group. Such signatures 
must be anonymous and unlinkable but, whenever needed, a designated group manager can reveal the 
identity of the signer. During the last decade group signatures have been playing an important role in 
cryptographic research; many solutions have been proposed and some of them are quite efficient, with 
constant size of signatures and keys ([1], [6], [7] and [15]). However, some problems remain, among them
the large number of computations during the signature protocol, the difficulty of achieving
coalition-resistance, and the handling of member revocation. In this paper we investigate the use of a
tamper-resistant device (typically a smart card) to solve those problems efficiently.


1 Introduction 


In 1991, D. Chaum and E. van Heyst [8] introduced the concept of group signature schemes. A group signature scheme allows members to sign a document on behalf of the group in such a way that signatures remain anonymous and unlinkable for everybody but a group manager (GM), who can recover the identity of the signer whenever needed (the latter procedure is called “signature opening”). Numerous group signature schemes have been published and some of them are quite efficient ([1], [6], [7] and [15]). In more recent ones, signatures and public keys are constant-size and security is well established, allowing them to be used in various applications such as electronic cash ([15]), voting or bidding systems ([12]). However, some problems remain, among them the high computation cost of the signature, coalition-resistance and member revocation.

In this paper, we investigate a completely different approach for carrying out group signature schemes, namely the use of a tamper-resistant device, typically a smart card. This allows a very low cost during the signature phase: the signer only has to compute two or three modular exponentiations (in contrast with roughly a dozen in the scheme from [1], for example). Moreover, the coalition-resistance problem is very easy to solve when using smart cards, and simpler procedures can be used for member revocation.

The use of a smart card makes it possible to prevent an (untrusted) member from cheating, by letting his (trusted) device both secretly store the signature keys and control their legitimate usage. Smart cards also make it possible to provide solutions for member revocation that are generic (i.e. work with any group signature scheme) and efficient, in that the signatures are short and constant-size, and the number of computations (for the signer and the verifier) is constant. Moreover the work during the revocation protocol is constant. Since smart cards are increasingly used in real-life applications, our solutions can be implemented at a negligible extra cost.

This paper is organized as follows. The following section provides background on group signature schemes and points out the remaining problems. Section 3 presents our group signature scheme and shows that it is coalition-resistant. Section 4 presents various solutions for providing member revocation. Finally, we conclude in section 5.


2 Group Signature Schemes


This section presents the state of the art in the 
group signature area. It briefly introduces the se- 
curity properties and then the related works. 


2.1 Definition 


Definition 1. A group signature scheme is a signa- 
ture scheme which satisfies the following properties: 
(i) Correctness: a signature produced by a group 
member is always valid. 

(ii) Unforgeability: only group members are able to 
sign messages on behalf of the group. 

(iii) Anonymity: given a valid group signature, it 
is infeasible for everyone but the group manager to 
identify the actual signer. 

(iv) Unlinkability: deciding whether two different 
valid signatures were computed by the same group 
member is infeasible. 

(v) Exculpability: neither a group member nor the 
group manager can sign on behalf of other group 
members. 

(vi) Traceability: the group manager is always able 
to open a valid signature, i.e. to identify the actual 
signer. 

(vii) Coalition-Resistance: a colluding subset of 
group members should not be able to generate a valid 
signature that the group manager cannot link to one 
of the colluding group members. 


2.2 Related Works: Group Signature Schemes


Since the paper of Camenisch and Stadler [7], the same method has always been used to set up a group signature scheme. It is based on a difficult problem involving two or more values. Alice is a member of the group if and only if she knows a solution of this difficult problem.

If Alice wants to become a group member, she in- 
teracts with GM (who holds a secret key) in order 
to obtain in a blind manner her private key and 
her membership certificate. This latter value allows 
GM to establish the link between a signature and a 
group member. 

During the signature protocol, Alice encrypts her membership certificate, then “proves” that she knows a solution of the difficult problem and that she has correctly encrypted her certificate. As a consequence, this protocol involves numerous modular exponentiations. Someone who wants to verify the signature only has to verify the whole proof, also known as a signature of knowledge. The group manager can open the signature by decrypting Alice’s certificate.


Coalition-resistance has often been defeated ([7]) and was an unsolved problem until [1] and [6]. In these two articles, the authors propose new group signature schemes based on the strong RSA assumption ([3] and [9]) and prove that they are resistant to coalitions.


2.3 Related Works: Member Revocation


At any time a member can decide to leave the group. In this case, we can reasonably think that he will not try to cheat in the future, but this is far from certain. Furthermore, if a member is revoked from the group against his will, it is very plausible that he will try to keep on signing even though he no longer has the right to. In both cases, it is necessary to set up a mechanism which prevents this type of fraud.

The paper of E. Bresson and J. Stern [4] proposed the most intuitive solution, in which the signer proves that he is different from every revoked member. But this method obviously generates a signature whose size increases linearly with the number of revoked members.

In a recent paper, Song [14] proposed two revocation methods that are relatively similar and provide constant-length signatures and constant work for the group manager. But the work of the verifier is linear in the number of revoked members. Moreover, the solution is not very practical since it deals with a group with a limited life-expectancy. Ateniese, Song and Tsudik [2] proposed a modification of the Ateniese et al. scheme [1] to improve member revocation, which also provides a constant size of signature. But the work during the revocation phase and the verification phase is linear in the number of revoked members. Finally, the signature is very expensive to compute, and consequently it is an impractical solution overall.

Very recently, Camenisch and Lysyanskaya [5] proposed the first practical method for member revocation. It is also based on the scheme of Ateniese et al. [1] and therefore is not really generic (i.e. cannot be easily applied to any other group signature scheme). Moreover the signer has to perform (possibly off-line) a number of modular exponentiations which is proportional to the number of modifications in the group (additions or deletions) since his last signature. Finally, this solution implies additional proofs of knowledge and, consequently, many other modular exponentiations.


3 Group Signature Schemes and Smart Cards


In this paper, we propose to build a group signature scheme relying on a tamper-resistant device (typically a smart card). It enables us to obtain straightforwardly the integrity of the (public or secret) data and of the program implemented in this tamper-resistant device. Moreover, the confidentiality of keys and data is easily preserved in the same way. As a consequence, a solution simpler than previously proposed ones ([1] or [6]) can be introduced.


3.1 Shared Private Key and Smart Card


Our solution consists in using a smart card and a group-shared private key. First of all, we must choose an ordinary signature scheme (keys SK_G and PK_G) and a semantically secure cryptosystem (keys D_aut and E_aut), that is, a cryptosystem in which the ciphertext does not leak any partial information whatsoever about the plaintext that can be computed in expected polynomial time (consequently, it is a probabilistic cryptosystem). Then, the group manager computes the keys in such a way that he can keep the private ones secret (D_aut) or distribute them (SK_G) to members without knowing them (for example, several group managers can share a discrete logarithm as the private key). He publishes the public keys (PK_G and E_aut).

If Alice wants to become a new group member, she first has to hold a smart card. Then, she has to obtain from the group manager an identifier z (which is unique and identifies her) and the shared private key SK_G (which is common to all group members). Alice’s smart card also has access to all the parameters needed to use the cryptosystem (among which E_aut) and the signature scheme defined above. The group manager has to keep a record of the link between the identifier (i.e. z) and the identity of the group member (i.e. Alice).

When Alice wants to sign a message as a group member (see Figure 1), she has to use her smart card. First, the identifier z is encrypted (algorithm EA) with the group manager’s public key E_aut (so that the group manager is the only one who can decrypt). Then the message M is concatenated with this encrypted value C and the whole is signed with (algorithm SA and) the shared private key SK_G. As a consequence, only group members can sign a message and everybody is able to verify the signature with the associated public key PK_G.


M = Message
|| = Concatenation algorithm
z = Member’s identifier
M’ = Concatenation of M and C
EA = Encryption algorithm
SA = Signature algorithm
E_aut = GM’s encryption key
S_G = Signature of the message
C = Encryption of the identifier
SK_G = Group-shared signature private key

Figure 1: Shared Private Key and Smart Card


The verifier obtains the encrypted value C, the message M, and the signature S_G of the whole. He only has to verify the signature to be sure that the message was sent by a group member (because only group members possess the group-shared private key used to compute the signature). The group manager can open the signature by decrypting the identifier (with the key D_aut).
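
As an illustration only, the signing, verification and opening steps described above can be sketched as follows. The sketch assumes a textbook ElGamal encryption and a Schnorr-style signature over a toy modulus; these choices, the parameter values and all function and class names are ours and are not prescribed by the scheme, which works with any signature scheme and any semantically secure cryptosystem. The Python class merely stands in for the tamper-resistant card.

```python
import hashlib
import secrets

# Toy parameters (assumption): P is the Mersenne prime 2**127 - 1.
# Far too weak for real use; chosen so the sketch needs only the stdlib.
P = 2**127 - 1
G = 3
ORDER = P - 1  # exponents are reduced modulo the order of Z_P^*


def keygen():
    """Return (private, public) with public = G**private mod P."""
    x = secrets.randbelow(ORDER - 1) + 1
    return x, pow(G, x, P)


def encrypt(e_aut, z):
    """EA: ElGamal encryption of z; a fresh r makes it probabilistic."""
    r = secrets.randbelow(ORDER - 1) + 1
    return pow(G, r, P), (z * pow(e_aut, r, P)) % P


def decrypt(d_aut, c):
    c1, c2 = c
    return (c2 * pow(c1, ORDER - d_aut, P)) % P


def encode(c):
    return c[0].to_bytes(16, "big") + c[1].to_bytes(16, "big")


def _challenge(commit, data):
    digest = hashlib.sha256(commit.to_bytes(16, "big") + data).digest()
    return int.from_bytes(digest, "big") % ORDER


def sign(sk, data):
    """SA: Schnorr-style signature (commit, s), s = k + e*sk mod ORDER."""
    k = secrets.randbelow(ORDER - 1) + 1
    commit = pow(G, k, P)
    return commit, (k + _challenge(commit, data) * sk) % ORDER


def verify(pk, data, sig):
    commit, s = sig
    e = _challenge(commit, data)
    return pow(G, s, P) == (commit * pow(pk, e, P)) % P


class Card:
    """Stands in for the smart card: z and SK_G never leave it."""

    def __init__(self, z, sk_g, e_aut):
        self._z, self._sk_g, self._e_aut = z, sk_g, e_aut

    def group_sign(self, message):
        c = encrypt(self._e_aut, self._z)            # C = EA(E_aut, z)
        s_g = sign(self._sk_g, message + encode(c))  # S_G = SA(SK_G, M || C)
        return c, s_g


def group_verify(pk_g, message, c, s_g):
    """Verifier: checks S_G on M || C with the public key PK_G."""
    return verify(pk_g, message + encode(c), s_g)


def open_signature(d_aut, c):
    """Group manager: recovers the identifier z with D_aut."""
    return decrypt(d_aut, c)
```

Two signatures produced by the same card on the same message carry different ciphertexts C, which is what the probabilistic encryption is meant to provide; only the holder of D_aut can map either of them back to z.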

It is important to note that the encryption scheme can be either symmetric or asymmetric. Nevertheless, it must be probabilistic. In contrast, the signature scheme must be asymmetric, since verifiers must be able to check signatures without holding the signing key.

This approach makes a very fast signature possible, since there is only one encryption and one ordinary signature to compute. Consequently, our solution is much better than previous ones in terms of speed and memory and in terms of genericity (any signature scheme can be employed).

Furthermore, it can be used in an on-line/off-line manner as follows: first of all, the card precomputes several encrypted values C in an off-line phase. Then, by using an on-line/off-line signature scheme SA, the card can precompute some values in an off-line phase, and later (in the on-line phase) produce group signatures very quickly, for example by doing a single multiplication if using the algorithm known as GPS ([10] and [13]).


3.2 Coalition-Resistance


The problem of coalition-resistance is easily solved when using tamper-resistant devices. In fact, it is impossible for two members to create a new card because they cannot access protected data. In particular, they have no knowledge of the group-shared secret key SK_G (only their cards have). Moreover, producing a signature without knowing the private key violates the security assumption of the underlying signature scheme.


3.3 Security Arguments


Theorem 1. Under the assumption that a smart 
card is tamper-resistant, the group signature scheme 
proposed in section 3.1 is secure. 


Proof (sketch).

We have to show that our scheme satisfies all the security properties listed in Definition 1.

(i) Correctness: by construction.

(ii) Unforgeability: only group members can have the private group-shared key in their smart card (due to their interaction with the group manager) and consequently can sign on behalf of the group.

(iii) Anonymity: everybody has the same private signature key and the identifier of the signer is encrypted. As a consequence, a verifier cannot identify the signer because each group member can potentially compute the same signature and he cannot learn anything from the encrypted value (since the cryptosystem is semantically secure).

(iv) Unlinkability: group members have a shared key and the cryptosystem is semantically secure. It is then infeasible to link two different signatures.

(v) Exculpability: this is due to the fact that the identifier of a signer is embedded in his group signature and that the smart card is tamper-resistant. Moreover, this property is ensured w.r.t. the group manager since he does not know the group-shared key (see the first paragraph of section 3.1).

(vi) Traceability: the card always encrypts the identifier of the group member. As a consequence, the group manager can always decrypt it and then open the signature.

(vii) Coalition-Resistance: see the remark in section 3.2. □


4 Revocation in Group Signature Schemes


We suggest two approaches for dealing with member 
revocation. The first one is based on a group-shared 
private key and, as in section 3, relies on the confi- 
dentiality of this key (even w.r.t. the card-holder). 
The second one is based on “black lists” and relies 
on the integrity of the “black list” membership pro- 
gram executed by the card. 


4.1 First Approach 


4.1.1 General Principle. 


Our approach consists in generating an additional signature computed with a group-shared private key SK_G. We denote by PK_G the associated public key. SK_G is communicated by the group manager to each non-revoked member by means of a group key distribution scheme (for example [16]). As a consequence, the revocation problem is reduced to a group key distribution problem, for which solutions already exist. Moreover, it happens that, in our case, these solutions are easier to use.

When a new member wants to join the group, the group manager securely sends him, among other elements, the group-shared key SK_G. When a member is revoked, the group manager triggers a member revocation mechanism, which implies the renewal of the group-shared key. It is impossible for the revoked member to learn anything about the new shared key and consequently he cannot sign anymore. The group manager has to publish data that make it possible for the other members to get the new key.

After that, if a member wants to sign a message M on behalf of the group (see Figure 2), he computes his group signature as usual (using [1], [6] or the solution described in section 3 for example) to obtain a pair (M, S_G), which he then signs by means of SK_G. The receiver can then verify the latter signature with PK_G and the value S_G as a signature of a group member.


M = Message
M’ = Concatenation of M and S_G
K_G = Group (private/secret) key(s)
SK_G = Group-shared signature private key
GSA = Group signature algorithm
SA = Signature algorithm
S_G = M’s group signature
S = Signature of the message
|| = Concatenation algorithm

Figure 2: First Approach - Signature Protocol


4.1.2 Group Key Distribution. 


The simplest solution to manage group key distribution for our proposal is to share a secret key with each group member and to encrypt the new group-shared key with each secret key. Each valid member can decrypt one of the encrypted values to obtain the new group-shared key.

The identifier of the group member can be appended 
to each encrypted value. The group member only 
has to test if it is his own identifier and to decrypt 
the corresponding value if it is the case (see Fig- 
ure 3). 
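
The simple distribution method just described can be sketched as follows. This is a minimal illustration under stated assumptions: the XOR-with-HMAC-keystream cipher is a toy stand-in for whatever symmetric encryption is actually used, the new group-shared key is treated as an opaque byte string of at most 32 bytes, and the function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def toy_encrypt(key, plaintext):
    """Toy probabilistic cipher (assumption): XOR with an HMAC-SHA256 keystream."""
    if len(plaintext) > 32:
        raise ValueError("toy cipher handles at most 32 bytes")
    nonce = secrets.token_bytes(16)
    stream = hmac.new(key, nonce, hashlib.sha256).digest()
    return nonce + bytes(a ^ b for a, b in zip(plaintext, stream))


def toy_decrypt(key, blob):
    nonce, body = blob[:16], blob[16:]
    stream = hmac.new(key, nonce, hashlib.sha256).digest()
    return bytes(a ^ b for a, b in zip(body, stream))


def publish_rekey(member_keys, revoked, new_group_key):
    """GM side: one (identifier, ciphertext) element per non-revoked member."""
    return [(z, toy_encrypt(k, new_group_key))
            for z, k in member_keys.items() if z not in revoked]


def fetch_group_key(my_id, my_key, published):
    """Card side: scan the elements and decrypt the one tagged with our identifier."""
    for z, blob in published:
        if z == my_id:
            return toy_decrypt(my_key, blob)
    return None  # identifier absent: this member has been revoked
```

The published list grows with the number of remaining members, which is the cost that tree-based key distribution schemes aim to reduce.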





[Flowchart residue: “Receive an element z || E(SK_G)”]

Figure 3: First Approach - Getting the Key


There are other solutions in the literature that are more efficient than this simple one. For example, Wong et al. [16] propose a solution based on a tree, where each leaf corresponds to a group member and where each node corresponds to a secret key. Each group member shares with the group manager all the keys that are on the path between his leaf and the root. As every member knows the root key, the latter is chosen as the group-shared key. Consequently, for a particular revocation phase, the GM only has a limited number of values to encrypt, instead of many in the naive method.
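
To give a rough idea of the saving, the following sketch counts the encryptions needed for one revocation. It assumes a complete binary key tree and assumes that each replaced key is sent encrypted under the keys of its two children; this is one common way of rekeying such a tree and not necessarily the exact protocol of [16]. The heap-style node numbering and the function names are ours.

```python
def tree_rekey_messages(n_leaves, revoked_leaf):
    """Return (replaced_node, encrypting_child) pairs for one revocation.

    Heap numbering (assumption): the root is node 1, the children of node v
    are 2v and 2v+1, and member i sits at leaf n_leaves + i.
    """
    if n_leaves < 2 or n_leaves & (n_leaves - 1):
        raise ValueError("n_leaves must be a power of two, at least 2")
    if not 0 <= revoked_leaf < n_leaves:
        raise ValueError("revoked_leaf out of range")
    revoked_node = n_leaves + revoked_leaf
    messages = []
    parent = revoked_node // 2
    while parent >= 1:
        # Every key on the path from the revoked leaf to the root is replaced.
        for child in (2 * parent, 2 * parent + 1):
            if child != revoked_node:  # nothing is sent to the revoked card
                messages.append((parent, child))
        parent //= 2
    return messages


def naive_rekey_count(n_members):
    """Simple method: one encryption per remaining member."""
    return n_members - 1
```

For 1024 members this gives 19 encryptions with the tree against 1023 with the simple method, under the assumptions stated above.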


4.1.3 Security and Efficiency Considerations.


There is no way for the revoked member to learn anything about the new group-shared key, so the key contained in his smart card is no longer valid. As a consequence, his second signature will never be correct again. The group manager can thus efficiently and securely revoke group members.

The size of the signature is constant and the group 
signature is only increased by a single classical sig- 
nature. Moreover, this method can be applied to 
any group signature scheme (including the one of 
section 3) and there is no extra work for the verifier 
(the cost is constant). The revocation protocol de- 
pends on the group key distribution scheme which 
is used. In particular, its cost will be at most linear 
in the number of group members. 


4.1.4 Shared Private Key and Smart Card: Dynamic Case.


Section 3 presents a new group signature scheme based on a shared secret key and a smart card. Section 4.1 presents a solution to the problem of revocation that adds to the general group signature an ordinary signature that depends on a group-shared key. If one wants to apply this revocation method to this group signature, each signer will a priori have to compute two different signatures. But the two signatures can easily be merged into a single one, since they both use a group-shared secret key. This leads to a very attractive method which allows revocation while generating only one signature. More precisely, only one signature is necessary because it is possible to replace the (fixed) group-shared key of section 3 with a dynamic group-shared key, as explained in section 4.1. The group-shared key used in the group signature scheme only needs to be modified by the group manager after each revocation (see section 4.1.2) and the rest is unchanged. Figure 1 shows the mechanism carried out by the smart card during the signature phase, to which must be added the key updating phase illustrated in Figure 3.


4.2 Second Approach 


4.2.1 General Principle. 


Generally speaking, the simplest idea to deal with the revocation problem is to maintain a revocation list (or “black list”). The signer reveals a personal value and the verifier is then able to say, by matching the received value against each entry of the “black list”, whether the person is revoked or not. Unfortunately, in the context of group signatures, it is not possible to reveal a personal value since it would compromise the anonymity of the signer. Using a smart card makes it possible to give a simple solution to this problem. Figure 4 shows the general principle of this approach.









Figure 4: Second Approach - General Principle

In short, each member owns a personal value (an identifier). The smart card gets the revocation list from the group manager database (or any database where the “black list” is stored, e.g. the verifier’s device) and checks whether any value of the list matches its personal value. If the card reaches the end of the list without a match, it accepts to sign as a group member; if its personal value lies in the list, the card refuses to sign and puts itself out of order.


4.2.2 First Solution. 


Description. The first solution is straightforward and Figure 5 shows its principle. It consists in having the whole “black list” signed by the GM. Assuming that the underlying hash function of the signature scheme is iterative (most of them are), the smart card can verify the signature of a large message without needing to keep the entire message in its memory.
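
The card-side processing can be sketched as follows, as a minimal illustration rather than a definitive implementation. It assumes SHA-256 as the iterative hash and, loudly, uses an HMAC under a demo key as a stand-in for the GM’s public-key signature (a real card would verify with the GM’s public key instead). The function names, the length-prefixed encoding and the date format are hypothetical.

```python
import hashlib
import hmac

GM_KEY = b"demo-only key"  # assumption: HMAC stands in for GM's signature


def _absorb(running, item):
    # Length prefix keeps entry boundaries unambiguous in the hashed stream.
    running.update(len(item).to_bytes(2, "big") + item)


def gm_sign_list(date, entries):
    """GM side: authenticate the update date and the whole black list."""
    running = hashlib.sha256()
    _absorb(running, date)
    for entry in entries:
        _absorb(running, entry)
    return hmac.new(GM_KEY, running.digest(), hashlib.sha256).digest()


def card_may_sign(my_id, date, entries, gm_tag):
    """Card side: one pass over the list, constant memory."""
    witness = 1                    # 1 = identifier not found so far
    running = hashlib.sha256()     # iterative hash, updated entry by entry
    _absorb(running, date)
    for entry in entries:          # entries may come from any iterator
        _absorb(running, entry)
        if entry == my_id:
            witness = 0
    expected = hmac.new(GM_KEY, running.digest(), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, gm_tag):
        raise ValueError("black list not authenticated by GM")
    return witness == 1
```

Because the card only keeps the running hash state and a one-bit witness, the length of the list does not affect the memory it needs.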


[Flowchart steps recovered: “Initialize the witness to 1”, “Sign the message”]

Figure 5: Second Approach - First Solution


Note that it is possible to use this method in a context of “white list” (that is, a list which contains the identifiers of all members). In this case the card accepts to sign only if its identifier is in the list. It can be useful if the group has few members but a lot of revocations. We do not treat this case in this paper as it is an easy adaptation of the “black list” case.


Security. The mechanism is secure under the assumption that the card is tamper-resistant. Indeed, an attacker who wants to add values to the revocation list cannot do so because he cannot forge the group manager’s signature. Likewise, it is impossible to substitute one value for another because the signature would then be incorrect. Moreover, removing a value from the revocation list would generate a card error because the final signature verification would fail. Finally, replaying the same revocation list indefinitely would lead the verifier to reject the signature, because he can compare the date of the last update by GM (D_GM) with the date of the last signature by the smart card (D_C). If D_C is different from D_GM, he can suspect that the signer has attempted to cheat. For example, the revocation list can be updated every day. Another solution is the use of an on-line verification (even if it is an “extreme” case). We can then conclude that the previous mechanism is secure under the assumption that the card is secure.


Efficiency Considerations. This is a generic so- 
lution with a constant size of signature. In fact, the 
size of the signature is the same as that of the un- 
derlying signature scheme. From a computational 
point of view, there is a number of equality tests 
that is proportional to the number of revoked mem- 
bers, which can be considered as negligible, and the 
verification of only one signature. Another advan- 
tage of this solution is that the verifier does not 
have any extra computation to do. His work is no 
greater than that of the verifier in the underlying 
signature scheme. The work during the revocation 
phase is also constant. The group manager only has 
to add a value in the revocation list and to modify 
the resulting signature. 


4.2.3 Second Solution. 


Description. The second solution is also straightforward (see Figure 6). It consists in sending to the card all elements of the “black list” one by one, each of them signed by the group manager. It is however necessary to add a revocation number (a sequence number: number 1 corresponds to the first revoked member, etc.) to prevent some attacks (for example addition or substitution of some identifiers). In addition, GM signs the date of his last update of the “black list”, D_GM, and the number of revoked members.

[Flowchart residue: “witness to 1”, “counter to 0”]

Figure 6: Second Approach - Second Solution


Security. The mechanism is secure under the as- 
sumption that the card is tamper-resistant. In fact 
an attacker cannot add some more values in the re- 
vocation list because he cannot afterwards compute 
the related signature. He cannot substitute a value 
for another one because the corresponding signature 
would then be incorrect. Removing a value from 
the revocation list would generate a card error be- 
cause the final test on the signed number of revoked 
members would be wrong. Finally, as for the first 
solution (see section 4.2.2), there is no way to replay 
indefinitely the same list. 


Efficiency Considerations. This is a generic solution with a constant size of signature. Once again, the size of the signature is the same as that of the underlying signature scheme. However, the signer has to check the validity of the GM signature for each revoked member, which makes his work linear in the number of revoked members. The work of the group manager is constant since he only has to add a new value and to compute two signatures at each revocation. The verifier’s work is also constant. Note that this method can also be used in a context of “white list”.


An Improvement. At first glance, this solution seems to be less attractive than the first one. Indeed, the number of signatures to be verified is large if there are many revoked members. But a modification can be made so as to improve it.

We can assume that nobody can see or modify the data exchanged between the smart card and the card reader. This is a plausible assumption if we consider that each member of the group has a personal card reader that is always connected to his own computer.

Therefore we can improve the solution by storing in the smart card memory a new value that corresponds to the number of values that the card has already verified in the group manager database. Indeed, the card does not need to test the same values twice. Consequently, it can inform the card reader of the number of values it has already tested, and the card reader will then only send to the card the values added since the last signature of that card (plus the signature of the update date and of the number of revoked members). As a result, the card will only have a limited number of GM signatures to verify before producing a signature.
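
The second solution together with this improvement can be sketched as follows. As a loud assumption, an HMAC under a demo key again stands in for the GM’s public-key signatures; the class and function names, the encoding of sequence numbers and the date values are hypothetical.

```python
import hashlib
import hmac

GM_KEY = b"demo-only key"  # assumption: HMAC stands in for GM's signature


def gm_tag(*parts):
    mac = hmac.new(GM_KEY, digestmod=hashlib.sha256)
    for part in parts:
        mac.update(len(part).to_bytes(2, "big") + part)
    return mac.digest()


def gm_entry(seq, ident):
    """GM side: one black-list element with its sequence number and tag."""
    return seq, ident, gm_tag(b"entry", str(seq).encode(), ident)


def gm_summary(total, date):
    """GM side: tag on the number of revoked members and the update date."""
    return gm_tag(b"summary", str(total).encode(), date)


class Card:
    def __init__(self, my_id):
        self.my_id = my_id
        self.counter = 0   # entries already verified (kept between signatures)
        self.witness = 1   # stays at 0 once the holder is found in the list

    def update(self, new_entries, total, date, summary_tag):
        """Process only entries numbered counter+1 .. total; True if may sign."""
        for seq, ident, tag in new_entries:
            if seq != self.counter + 1:
                raise ValueError("gap or replay in sequence numbers")
            if not hmac.compare_digest(tag, gm_entry(seq, ident)[2]):
                raise ValueError("bad GM signature on entry")
            if ident == self.my_id:
                self.witness = 0
            self.counter = seq
        ok = hmac.compare_digest(summary_tag, gm_summary(total, date))
        if not ok or self.counter != total:
            raise ValueError("list truncated or summary not authentic")
        return self.witness == 1
```

The stored counter is what lets the reader send only the entries added since the card last signed, so the number of per-entry verifications depends on recent revocations rather than on the whole list.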


4.2.4 Third Solution. 


A variant of the first solution consists in replacing the “black list” by a much shorter digest, so that the verification step becomes much faster on average. If the output of this step is “no”, then we are sure that the member is not revoked and the card accepts to sign. However, if the output is “yes” then we cannot definitely conclude and the whole “black list” should be requested for a complete verification. We now briefly describe a possible way of achieving a compression of this kind.


An Example of Representation. The mechanism named “Superimposed coding” [11] makes it possible to store a set of data of variable size in a bit-string of fixed size. It is then possible, with a simple test, to estimate the probability that an element is in the set of data (which depends on the size of the resulting bit-string and on the number of data). This probability is equal to 0 if the output of the test is “no”.

More precisely, the result is an $m$-bit string named $B$. We write $B = b_{m-1} \ldots b_1 b_0$ where each $b_i \in \{0,1\}$. Initially, $B$ is set to $00\ldots0$. We then have $k$ elements $y_1, \ldots, y_k$ of various sizes and we denote the set of data by $Y = \{y_1, \ldots, y_k\}$. Moreover, let us define $q$ hash functions $h_1, \ldots, h_q$ where each $h_l : \{0,1\}^* \to \{0,1\}^c$ with $m = 2^c$.

For $j = 1..k$ we compute $h_1(y_j), \ldots, h_q(y_j)$ and for every $l = 1..q$ we set to 1 the bit $b_i$ where $i = h_l(y_j)$. To know whether an element $y_r$ is in the set of data $Y = \{y_1, \ldots, y_k\}$, we compute $Y_l = h_l(y_r)$ for every $l = 1..q$; if there is an index $l_0 \in \{1, \ldots, q\}$ such that $b_{Y_{l_0}} = 0$, then $y_r \notin Y$. If not, then $y_r \in Y$ with an error probability of about $(1 - e^{-kq/m})^q$.
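
The construction and the membership test can be sketched as follows. As an assumption made for this illustration, the q hash functions are derived from SHA-256 by prefixing the index of the function and reducing modulo m; the function names are hypothetical and m is taken to be a multiple of 8.

```python
import hashlib


def positions(element, q, m):
    """Bit positions h_1(y), ..., h_q(y) in range(m) (SHA-256 based, assumption)."""
    out = []
    for index in range(q):
        digest = hashlib.sha256(index.to_bytes(2, "big") + element).digest()
        out.append(int.from_bytes(digest, "big") % m)
    return out


def build_bitstring(elements, q, m):
    """Set to 1 every bit b_i with i = h_l(y_j), for all elements and all l."""
    bits = bytearray(m // 8)
    for y in elements:
        for i in positions(y, q, m):
            bits[i // 8] |= 1 << (i % 8)
    return bytes(bits)


def maybe_member(bits, element, q):
    """False means certainly absent; True means present up to the error probability."""
    m = len(bits) * 8
    return all((bits[i // 8] >> (i % 8)) & 1 for i in positions(element, q, m))
```

A “no” answer is always exact, because an element that was inserted has all of its q bits set; only a “yes” answer can be wrong.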


Description. The group manager uses the “Superimposed coding” to transform the set of all personal keys of the revoked members into the $m$-bit string $B$. Then he signs the latter value. A smart card receives this signed bit-string, then processes it so as to verify the signature and to learn whether its holder is revoked or not.

According to the size of the group and more particularly to the number of revoked members, the size of the result bit-string and the number of packets will vary in order to obtain good trade-offs (negligible error probability and $m$ of reasonable size). For example, for $q = 8$ and $k = 10000$ (i.e. at most 10000 revoked members), the error probability is $2.3 \times 10^{-5}$ for a result bit-string of size $2^{18}$ bits (i.e. 32 Kbytes).
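
The figures of this example can be checked numerically with the approximate error probability (1 - e^(-kq/m))^q, where k is the number of stored elements, q the number of hash functions and m the number of bits. The function name below is ours.

```python
import math


def error_probability(k, q, m):
    """Approximate error probability (1 - exp(-k*q/m))**q of the membership test."""
    return (1.0 - math.exp(-k * q / m)) ** q


rate = error_probability(k=10000, q=8, m=2**18)
size_kbytes = 2**18 // 8 // 1024

print(f"error probability: {rate:.1e}")   # about 2.3e-05
print(f"bit-string size: {size_kbytes} Kbytes")
```

Doubling m to 2^19 bits, or revoking fewer members, lowers the error probability further, which is the trade-off mentioned above.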


Efficiency Considerations. This method is very interesting as the size of the signature and the number of computations remain constant and the resulting scheme is completely generic. Moreover, the verification work is constant. During the revocation protocol, computations are very simple and relatively independent of the number of revoked members, as the revocation manager only has to modify the resulting bit-string and compute the new associated signature. The only drawback is the probability of error, but since it can be made negligible, this third solution seems to be the most attractive one.


5 Conclusion 


We have introduced a new way of designing group signature schemes by using a tamper-resistant device (such as a smart card). First we showed how to build a (coalition-resistant) group signature scheme starting from any (ordinary) signature scheme and any (semantically secure) encryption scheme. Such group signatures can be computed very efficiently (typically only one or two exponentiations) and are constant-size. Then we addressed the member revocation problem and solved it by using two approaches: in the first one, the group signature is completed with a signature involving a group-shared key which is renewed at each revocation; in the second one, the card checks that its identifier does not lie in a “black list” before computing a group signature. As a result, smart cards make it possible to design group signature schemes which are simple, generic, efficient and secure at the same time.


Acknowledgments 


We are very grateful to David Arditti and Jacques 
Traoré for numerous discussions and comments. We 
would also like to thank the anonymous referees for 
their useful remarks. 


References 


[1] G. Ateniese, J. Camenisch, M. Joye, G. Tsudik. A Practical and Provably Secure Coalition-Resistant Group Signature Scheme. In M. Bellare, editor, Advances in Cryptology-Crypto’2000, volume 1880 of LNCS, pages 255-270. Springer-Verlag, 2000.

[2] G. Ateniese, D. Song and G. Tsudik. Quasi-Efficient Revocation of Group Signatures. In Financial Cryptography 2002, Southampton, Bermuda, March 11-14, 2002.

[3] N. Barić, B. Pfitzmann. Collision-Free Accumulators and Fail-Stop Signature Schemes Without Trees. In W. Fumy, editor, Advances in Cryptology-Eurocrypt’97, volume 1233 of LNCS, pages 480-484. Springer-Verlag, 1997.

[4] E. Bresson, J. Stern. Efficient Revocation in 

Group Signatures. In K. Kim, editor, Public Key 

Cryptography-PKC2001, volume 1992 of LNCS, 

pages 190-206. Springer-Verlag, 2001. 


[5] J. Camenisch, A. Lysyanskaya. Efficient Revoca- 
tion of Anonymous Group Membership Certifi- 
cates and Anonymous Credentials. Crypto’2002, 


to appear. 


[6 


J. Camenisch, M. Michels. A Group Signature 
Scheme based on an RSA-variant. Technical Re- 
port RS-98-27, BRICS, Dept. of Comp. Sci., 
University of Arhus, preliminary version in Ad- 
vances in Cryptology-EUROCRYPT’98, volume 
1514 of LNCS. 


J. Camenisch, M. Stadler. Efficient Group Sig- 
nature Schemes for Large Groups. In B. Kaliski, 
editor, Advances in Cryptology-CRYPTO’97, 
volume 1296 of LNCS, pages 410-424. Springer- 
Verlag, 1997. 


[7 


[8 


D. Chaum, E. van Heyst. Group Signatures. In 
D. W. Davies, editor, Advances in Cryptology- 
Eurocrypt’91, volume 547 of LNCS, pages 257- 
265. Springer-Verlag, 1991. 

[9] E. Fujisaki, T. Okamoto. Statistical Zero- 
Knowledge Protocols Solution to Identification 
and Signature Problems. In A.M. Odlyzko, edi- 
tor, Advances in Cryptology-Crypto’97, volume 
1294 of LNCS, pages 16-30. Springer-Verlag, 
1997. 


[10] M. Girault. SelfCertified Public Keys. In 
D.W. Davies, editor, Advances in Cryptology- 
Eurocrypt’91, volume 547 of LNCS, pages 490- 
497. Springer-Verlag, 1991. 


[11] D. E. Knuth. The Art of Computer Pro- 
gramming, Volume 3 / Sorting and Searching. 
Addisson-Wesley Publishing Compagny. pages 
559-563. 1973. 


[12] K.Q. Nguyen, J. Traoré. An Online Public 
Auction Protocol Protecting Bidder Privacy. 
Information Security and Privacy, 5th Aus- 
tralasian Conference-ACISP 2000, pages 427- 
442. Springer-Verlag, 2000. 


CARDIS ’02: 5th Smart Card Research & Advanced Application Conference


[13] G. Poupard, J. Stern. A Practical and Provably
Secure Design for “on the Fly” Authentication
and Signature Generation. In K. Nyberg, editor,
Advances in Cryptology-Eurocrypt’98, volume
1403 of LNCS, pages 422-436. Springer-Verlag,
1998.

[14] D. Song. Practical Forward Secure Group
Signature Schemes. ACM Conference on Computer
and Communications Security. 2001.

[15] J. Traoré. Group Signatures and Their
Relevance to Privacy-Protecting Off-Line Electronic
Cash Systems. In J. Pieprzyk, R. Safavi-Naini,
J. Seberry, editors, Information Security and
Privacy, 4th Australasian Conference-ACISP’99,
volume 1587 of LNCS, pages 228-243. Springer-
Verlag, 1999.

[16] C. K. Wong, M. G. Gouda, S. S. Lam. Secure
Group Communications Using Key Graphs.
Technical Report TR-97-23, July 28, 1997,
revised version in IEEE/ACM Transactions on
Networking, Feb 2000.







USENIX Association 


Smart Cards in Interaction: 
Towards Trustworthy Digital Signatures 


Roger Kilian-Kehr 


Joachim Posegga 


SAP AG Corporate Research, CEC Karlsruhe 
Vincenz-Priessnitz-Str. 1, D-76131 Karlsruhe, Germany 


{roger.kilian-kehr, joachim.posegga}@sap.com


Abstract 


We present approaches to raise the security level in 
the process of electronic signature creation by shifting 
as many tasks as possible involved in digitally signing 
data into a tamper-resistant and trustworthy smart card. 
We describe the fundamental technical principles our ap- 
proach is based on, illustrate resulting design options, 
and compare the security of our approach with tradi- 
tional electronic signature scenarios. 


Keywords: electronic signatures, smart cards. 


1 Introduction 


The cryptographic underpinnings of electronic signa- 
tures such as mathematical one-way functions or public 
key cryptography are well understood, and practically se- 
cure algorithms and key lengths are widely established. 
From this perspective, electronically signing documents 
is a straightforward undertaking. 

The actual procedure for digitally signing a document 
or a transaction, however, is a complex scenario in prac- 
tice which involves numerous issues beyond cryptogra- 
phy: Since a person who wants to create a digital sig- 
nature will usually not carry out the relevant computa- 
tion by herself, she needs to delegate this to some appli- 
cation running on a platform (device) that can perform 
such computations. The security level of the overall sig- 
nature creation process therefore depends on the security 
of several other, non-cryptographic factors, e.g. the en- 
vironment where the document/data presentation takes 
place, the security of the communication channel to a 
user, or the security properties of the environment where 
the cryptographic computations are carried out. 

Consider a scenario where a user signs a document
displayed in a Web browser on a PC; at best, this involves
a smart card, where the signing key is stored, and the
cryptographic algorithm to encrypt the document hash
(and other relevant data) is executed within the card. An
attacker who wants to trick the user into signing a fake
document would likely not attack the smart card, but the
environment within which it is used (i.e., the OS, the
driver of the smart card reader, the signing application,
the Web browser, etc.).


The “added value” of a smart card in such a scenario
is, largely, that it makes it hard to compromise the
cryptographic key, but the card contributes little to the
actual trustworthiness of an individual digital signature:
The card is used as a tamperproof device that executes
a fixed computational function, i.e., it reads a data block,
encrypts it, and returns the result. The card itself,
however, does not interact directly with the user (card
holder), but through a mediator like a PC or a mobile
phone. But these devices are usually a lot less secure
than a typical smart card.


This problem is, in theory, easy to solve: Raise the 
security level and require a closed, trustworthy system 
for applying electronic signatures. Unfortunately, this 
solution is extremely hard to roll out in practice, both 
because it is expensive and since dedicated hardware, 
which would be required, simply does not fit into today’s 
computing world. 


We propose to take another direction, and build upon
execution platforms that are provided by today’s smart
cards, in particular SIMs and USIMs used for GSM and
UMTS. Such cards offer functionality beyond the
“hard-wired”, secure token that smart cards are mostly
configured as: Besides holding a secret key and performing
cryptographic algorithms, GSM SIMs and UMTS USIMs
include application platforms (e.g. [7, 15]) that allow
programs running inside these smart cards to use services of
their host. A mobile phone hosting such a SIM provides I/O
and networking capabilities to the SIM over standardized
protocols [6, 3, 4, 5, 1, 2]. As a result, applications
running inside a SIM can actively initiate and control user
interaction, communicate over the network, etc.

Our paper discusses the various options for enhancing 
the security of the process of signing a document by in- 
volving a secure execution platform in such a smart card; 
essentially we investigate the following question under- 
lying such an approach: 


How much in terms of security can be ob- 
tained, if as much functionality as possible 
is shifted from untrusted components into a 
trustworthy platform available in a tamper- 
resistant device? 


Overall, our research provides means for increasing the
trustworthiness of digital signatures by imposing fewer
assumptions on the integrity of a card terminal than classic
approaches do.


Paper Outline 


The main research contribution of our paper is given 
within Sect. 2. After introducing some notational con- 
ventions with standard digital signatures in Sect. 2.1, we 
investigate basic, on-card hash computation in Sect. 2.2. 
Section 2.3 extends this by involving a trusted third party. 
A third approach integrating the identity of the docu- 
ment’s originator into the signature protocol is presented 
in Sect. 2.4. Although all the approaches are vulnerable 
to so-called “conspiracy attacks” they represent signifi- 
cant improvements in the overall security of an electronic 
signature creation process. 

Based on the results of the previous approaches 
Sect. 2.5 proposes to digitally sign user interactions trig- 
gered by scripts that run inside smart cards to enable 
the comfortable, application-driven creation of electronic 
signatures on mobile devices. 

Section 3 compares our work to related approaches, 
and we finally wrap-up our work in Sect. 4. 


2 Smarter Signing with Smart Cards 


This section explores several options for implement- 
ing the process of digitally signing documents by tak- 
ing advantage of secure application platforms in smart 
cards: We discuss the security benefits of moving more 
and more of the required computation into the secure en- 
vironment of a card. 

As the starting point, consider the “traditional”
procedure, where smart cards are used as crypto tokens
holding a secret key and providing an implementation of
cryptographic algorithms.


2.1 Basic Electronic Signature Protocol


The most important roles in scenarios for electronic
signature creation are the signer S owning a public key
pair $(S_S, P_S)$, the document to be signed D, the
signature creation application A, a document viewer V
interacting with the signer, a smart card C, and the originator
O of the document D. The basic protocol is as follows:


(1) $O \to A$: $\{D\}$
(2) $A \to V, S$: $\{D\}$
(3) $S, V \to A$: accept/reject
(4) $A \to C$: $\{h(D)\}$
(5) $C \to A \to O$: $\{sig_{S_S}(h(D))\}$


Here, (1) denotes the document transfer from the origi- 
nator to the signature creation application, (2) the docu- 
ment presentation, (3) the signer’s interaction/choice, (4) 
the hash computation, and (5) the signing process. 

The above procedure can be improved with respect to
security by moving some of these individual steps partially
into the secure environment of a smart card. First we
consider on-card hash computation.


2.2 Electronic Signatures with On-Card Hash 
Computation 


The computation of the hash function is certainly a
possible target for an attacker who wants to manipulate a
signing procedure; but performing the hash computation
inside a trusted device such as a smart card is not in itself
a panacea: what matters is how the document presentation
and hash computation are done in the overall signature
protocol. Consider for example the following case:

(1) $A \to C$: $\{D\}$
(2) $C \to A$: $\{sig_{S_S}(h(D))\}$

In this case, (1) denotes the document transfer to the 
smart card and (2) the document signing process. From 
a security point of view an intruder J who is in control of 
A can easily exchange document D with another docu- 
ment D’ which is subsequently sent to the card, hashed, 
and finally signed. Hence, compared with the basic pro- 
tocol, no additional benefit can be gained from moving a 
hash computation into a card in a straightforward way. 
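
The substitution described above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: `CARD_KEY`, `card_sign` and `compromised_app` are hypothetical names, and an HMAC stands in for the card's public-key signature only so that the example runs on the standard library.

```python
import hashlib
import hmac

# Hypothetical key held inside the card. HMAC is a symmetric stand-in for
# sig_{S_S}; a real card would use a public-key signature scheme.
CARD_KEY = b"demo-signing-key"


def card_sign(document: bytes) -> bytes:
    """Card C: hash whatever bytes arrive and sign the digest."""
    digest = hashlib.sha256(document).digest()
    return hmac.new(CARD_KEY, digest, hashlib.sha256).digest()


def compromised_app(shown_to_user: bytes, substitute: bytes) -> bytes:
    """Application A under the intruder's control.

    The user approved `shown_to_user`, but A forwards `substitute` to the card.
    """
    return card_sign(substitute)


D = b"Pay EUR 10 to Alice"
D_PRIME = b"Pay EUR 1000 to Mallory"
signature = compromised_app(D, D_PRIME)

# The card has no way to notice the swap: the signature covers D', not D.
print(signature == card_sign(D_PRIME))
print(signature == card_sign(D))
```

The card faithfully hashes and signs its input, so moving the hash into the card changes nothing as long as A decides what that input is.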


On-Card Hash Computation Protocol 


Assuming a scenario in which the signature creation
application A is located in a security module C, and the
viewer in a (less trustworthy) terminal, a possible
protocol is as follows:


(1) $O \to A$: $\{D\}$
(2) $A \to V, S$: $\{D\}$
(3) $S, V \to A$: accept/reject
(4) $C \to A \to O$: $\{sig_{S_S}(h(D))\}$

Here, (1) denotes the document transfer to the
application being hosted by the smart card, (2) the document
presentation, (3) the user’s choice, and (4) the hash and
signature computation in the card.
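
The four steps can be sketched in Python as follows. This is a hedged illustration: it assumes an integrity-protected channel between O and the card, modelled here by a MAC under a hypothetical shared `CHANNEL_KEY`; `ask_user` stands for steps (2) and (3) through the viewer, and HMAC again stands in for the card's signature.

```python
import hashlib
import hmac

CHANNEL_KEY = b"shared-by-originator-and-card"  # hypothetical channel key
SIGNING_KEY = b"signer-secret-key"              # hypothetical stand-in for S_S


def originator_send(document: bytes):
    """Step (1): O sends D together with an integrity tag for the card."""
    tag = hmac.new(CHANNEL_KEY, document, hashlib.sha256).digest()
    return document, tag


def card_application(document: bytes, tag: bytes, ask_user):
    """Steps (2)-(4), executed by application A inside the card C."""
    expected = hmac.new(CHANNEL_KEY, document, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("document was modified in transit")
    if not ask_user(document):      # presentation and accept/reject via V
        return None
    digest = hashlib.sha256(document).digest()
    return hmac.new(SIGNING_KEY, digest, hashlib.sha256).digest()


document, tag = originator_send(b"Order 3 units at EUR 20")
signature = card_application(document, tag, ask_user=lambda shown: True)
```

Because the hash is computed in the card over the bytes that passed the channel check, a party sitting between O and the card can no longer choose what gets hashed.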


Assuming end-to-end secure communication between
O and A/C, an intruder is not able to control the hash
computation anymore. Only the document presentation
and the user’s accept/reject response could be manipulated,
although the intruder controlling V cannot gain anything
from such manipulation, except by mounting the
following attack.


A Conspiracy Attack on On-Card Hash Computation


• The intruder J and the originator O cooperate.

• O sends the document D’, i.e., the document
which the attackers want to be signed by S.

• Upon invocation of V, J presents a fake document
D, which S might accept for signing.

• In the card, D’ is signed and sent back to O.


Hence, an attack is still possible if the intruder
subverting V and the originator O of the document directly
cooperate. Although this attack is of general importance,
in practice it means that it is no longer sufficient to
attack the user’s terminal only; the attacker must also
manage to actively send a faked document which the user
subsequently signs.
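
The attack steps listed above can be simulated in a few lines of Python. The names (`run_on_card_protocol`, `subverted_viewer`, `user_accepts`) are illustrative assumptions, and HMAC stands in for the card's signature.

```python
import hashlib
import hmac

SIGNING_KEY = b"signer-secret-key"  # hypothetical stand-in for S_S


def card_sign(document: bytes) -> bytes:
    digest = hashlib.sha256(document).digest()
    return hmac.new(SIGNING_KEY, digest, hashlib.sha256).digest()


def run_on_card_protocol(received: bytes, viewer, user_accepts):
    """On-card protocol: present via V, ask the user, then sign in the card."""
    shown = viewer(received)        # step (2): presentation through V
    if not user_accepts(shown):     # step (3): the user's choice
        return None
    return card_sign(received)      # step (4): card signs what O sent


D = b"Buy 1 book for EUR 20"              # what the user is willing to sign
D_PRIME = b"Buy 100 books for EUR 2000"   # what the attackers want signed


def user_accepts(shown: bytes) -> bool:
    return shown == D


def honest_viewer(received: bytes) -> bytes:
    return received


def subverted_viewer(received: bytes) -> bytes:
    return D                        # J shows the decoy instead of D'


# The cooperating originator sends D' in both runs.
blocked = run_on_card_protocol(D_PRIME, honest_viewer, user_accepts)
forged = run_on_card_protocol(D_PRIME, subverted_viewer, user_accepts)
```

With an honest viewer the user sees D’ and refuses; the attack only succeeds when the subverted viewer and the originator act together.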


As a consequence, shifting the hash computation in 
the above manner to a tamper-resistant device seems to 
give a substantial improvement in the overall security of 
the signature creation process. 


2.3 Electronic Signatures Assisted by a Trusted 
Third Party 


On-card hash computation is often not feasible, e.g.
due to the limited bandwidth one can use when
communicating with a smart card. The process of computing
hashes can, however, also be delegated to a trusted third
party T as the following protocol outlines. A URL $url_D$
is used to denote some resource where D can be fetched
from. The trusted third party T then computes D’s hash
on behalf of A and signs it. A just forwards the URL
to the document viewer V and the further protocol steps
are the same as in the on-card hash computation protocol
(cf. Sect. 2.2).


(1) $O \to A$: $\{url_D\}$
(2) $A \to T$: $\{url_D\}$
(3) $T \to A$: $\{sig_{S_T}(h(D))\}$
(4) $A \to V, S$: $\{url_D\}$
(5) $S, V \to A$: accept/reject
(6) $A \to C$: $\{sig_{S_T}(h(D))\}$
(7) $C \to A \to O$: $\{sig_{S_S}(h(D))\}$


In this protocol, (1) denotes the transmission of the URL
under which the document to be signed is located to the
application, (2) passing the URL to the TTP, (3) the TTP
fetching the document and computing the hash, (4) the
document presentation to the user, (5) the user’s
choice, (6) pass-through of the TTP’s signature to the
card and verification of the signature, and (7) the final
signature computation by the smart card.

Similar to the on-card hash computation protocol, it is 
vulnerable to a conspiracy attack as described above. 
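
A minimal Python sketch of steps (2), (3), (6) and (7) follows. It rests on several assumptions: the `DOCUMENTS` dictionary stands in for fetching the resource behind the URL, the URL itself is a placeholder, the TTP is assumed to return the digest alongside its signature, and HMAC under a hypothetical `TTP_KEY` replaces the TTP's public-key signature (so, unlike in a real deployment, the card here holds the same key as the TTP).

```python
import hashlib
import hmac

TTP_KEY = b"ttp-key"                # hypothetical stand-in for the TTP's key
SIGNING_KEY = b"signer-secret-key"  # hypothetical stand-in for S_S

# Stand-in for fetching the document behind url_D.
DOCUMENTS = {"https://example.org/contract": b"Contract text"}


def ttp_attest(url: str):
    """Steps (2)-(3): T fetches D, hashes it and signs the hash."""
    digest = hashlib.sha256(DOCUMENTS[url]).digest()
    return digest, hmac.new(TTP_KEY, digest, hashlib.sha256).digest()


def card_sign_attested(digest: bytes, ttp_tag: bytes) -> bytes:
    """Steps (6)-(7): the card checks T's signature, then signs the hash."""
    expected = hmac.new(TTP_KEY, digest, hashlib.sha256).digest()
    if not hmac.compare_digest(ttp_tag, expected):
        raise ValueError("hash is not vouched for by the trusted third party")
    return hmac.new(SIGNING_KEY, digest, hashlib.sha256).digest()


digest, ttp_tag = ttp_attest("https://example.org/contract")
signature = card_sign_attested(digest, ttp_tag)
```

The card never sees the document; it only signs a digest that carries a valid attestation from T, which keeps the data sent to the card small.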


2.4 Electronic Signatures with Recipient Ad- 
dressing 


Looking at the traditional signature creation proto- 
col it becomes obvious that authenticity of a document 
sender is not of particular concern. In electronic business 
processes, however, signatures are often used to provide 
the technical means for contracts between two parties. 
Although the identities of the contract partners are usu- 
ally somehow denoted in the document D, this is by no 
means cryptographically protected. 

To improve the signature process further, we include
the cryptographic identity of the document originator
into the signature process. In particular we propose the
following protocol which is based on the on-card hash
computation protocol (cf. Sect. 2.2) and the public key
pair $(S_O, P_O)$ of the originator O denoted by $id_O$:


(1) $O \to A$: $\{D, sig_{S_O}(D)\}$
(2) $A \to V, S$: $\{D, id_O\}$
(3) $S, V \to A$: accept/reject
(4) $C \to A \to O$: $\{sig_{S_S}(h(D), sig_{S_O}(D))\}$


Here, (1) denotes the document and signature transmis- 
sion, (2) the presentation of the document and the iden- 
tity of the originator, (3) the user’s choice, and (4) the 
final hash and signature computation. 

This protocol now achieves that an electronic
signature is created over both the cryptographic hash of the
document and the identity of the recipient or originator
of the signature.

To assess the advantages of this approach consider
that in a traditional signature attack scenario an
intruder could “hijack” the signing process of an
arbitrary document $D_O$ with its intended recipient O to
infiltrate another document D’ to be signed. The
intruder J could then claim that the user has signed this
document which is likely of advantage to the intruder.
In the above protocol, however, the intruder J is not
able to obtain a signature $sig_{S_S}(h(D'), sig_{S_O}(D'))$ since
the signature $sig_{S_O}(D')$ cannot be generated. At best
$sig_{S_S}(h(D'), sig_{S_J}(D'))$ could be obtained, but this
leads to a contradiction between the information available
in D’ denoting J as the recipient and the envelope
signature $sig_{S_J}$. Therefore, we argue that linking the
document and the recipient in the signature gives advantages
over standard electronic signature creation.
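
The binding of document and originator can be sketched in Python as follows. This is a hedged illustration: the `KEYS` table, the party labels and both function names are hypothetical, HMAC stands in for all public-key signatures, and the explicit check of the originator's signature inside the card is an assumption of this sketch rather than a step spelled out in the protocol.

```python
import hashlib
import hmac

# Hypothetical keys; HMAC stands in for the public-key signatures of S, O, J.
KEYS = {"S": b"signer-key", "O": b"originator-key", "J": b"intruder-key"}


def sign(who: str, message: bytes) -> bytes:
    return hmac.new(KEYS[who], message, hashlib.sha256).digest()


def verify(who: str, message: bytes, tag: bytes) -> bool:
    return hmac.compare_digest(sign(who, message), tag)


def card_sign_addressed(document: bytes, originator: str, originator_sig: bytes) -> bytes:
    """Step (4): sign the pair (h(D), sig_O(D)) with the signer's key."""
    if not verify(originator, document, originator_sig):
        raise ValueError("originator signature does not cover this document")
    return sign("S", hashlib.sha256(document).digest() + originator_sig)


def envelope_names(document: bytes, originator: str,
                   originator_sig: bytes, envelope: bytes) -> bool:
    """Check that the envelope binds `document` to `originator`."""
    expected = sign("S", hashlib.sha256(document).digest() + originator_sig)
    return (verify(originator, document, originator_sig)
            and hmac.compare_digest(expected, envelope))


D, D_PRIME = b"Contract with O", b"Contract favouring J"
sig_o = sign("O", D)
envelope = card_sign_addressed(D, "O", sig_o)

# The intruder can only attach a signature made with its own key to D'.
sig_j = sign("J", D_PRIME)
envelope_j = card_sign_addressed(D_PRIME, "J", sig_j)
```

An envelope produced for D’ can therefore only name J as originator; it cannot be presented as a signature addressed to O.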


Basically, the same conspiracy attack presented for
on-card hash computation in Section 2.2 can be mounted
against the recipient addressing scheme. Again, if originator
O and intruder J cooperate, the user is not able to detect
that signature creation occurs with a document that
she does not intend to sign.


2.5 Electronic Signatures on Interactions 


We have so far considered electronic signatures on 
standard clients, e.g. desktop PCs. One of the most prob- 
lematic issues with electronic signatures on mobile de- 
vices is the fact that such signatures are computed over 
complex documents. In particular this means that ac- 
cording to current signature laws, e.g. those in Germany, 
the document must be presented to the user who then ei- 
ther accepts or rejects the subsequent signature creation. 
Hence, a document to be signed must be presented as a 
whole in a suitably rendered fashion, which is often dif- 
ficult on small, mobile devices. The problem of encod- 
ing and subsequently displaying a document in a repro- 
ducible and standardized way has been extensively dis- 
cussed by Scheibelhofer [13]. In his approach he uses 
XML style sheets defining mappings to a possibly certi- 
fied rendering engine. 

To tackle this presentation problem, we consider not 
only the presentation of a document but also the way the 
document is created. We argue that a document is often 
the result of some kind of interaction between a service 
provider, e.g. who offers goods, and a client who selects 
goods to buy. Finally, after all selections are made, a doc- 
ument containing the complete list of goods is presented 
and signed accordingly. 

If such an interaction “document” is encoded as an ex- 
ecutable script, the execution of the script is deterministic 
as long as all non-deterministic input which is received 
from “outside” the script such as user input, random 
number generator, persistent variables, etc. is recorded. 
A “document” over which the signature is computed is 
then comprised of 







(a) the executed script, 
(b) the persistent state used during the computation, 
(c) all user input, 


(d) all messages received from other communication 
channels, 


(e) the current time and progress of execution, 


(f) some platform characteristics such as version 
numbers, serial numbers, etc. 


The signature can be easily verified by executing the 
script in a simulated environment using the recorded and 
signed input values. Thus, a signed document in this 
sense is not intended to be human-readable, but rather 
meant to record and log the interaction that happened be- 
tween a service provider and a user. 
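
One way to picture such a signed interaction “document” is as a record of the six components (a) to (f) that is serialized canonically and hashed. The Python sketch below is an assumption-laden illustration: the field names, the JSON encoding and the sample values are hypothetical and not taken from the platform described here.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class InteractionRecord:
    script: str               # (a) the executed script
    persistent_state: dict    # (b) persistent state used during the computation
    user_input: list          # (c) all user input
    messages: list            # (d) messages from other communication channels
    time_and_progress: dict   # (e) current time and progress of execution
    platform: dict            # (f) platform characteristics

    def digest(self) -> str:
        # Sorted keys and fixed separators give one byte string per record.
        canonical = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


record = InteractionRecord(
    script='push("Place new bid?"); select;',
    persistent_state={},
    user_input=[["int", "1"], ["string", "70"]],
    messages=[],
    time_and_progress={"time": "2001-12-23T10:00:00", "steps": 12},
    platform={"interpreter_version": "1.0"},
)
altered = replace(record, user_input=[["int", "1"], ["string", "700"]])
```

Any change to a recorded input yields a different digest, so a signature over the digest commits the signer to the whole interaction.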


A Smart Card Platform for Mobile Code 


More concretely, we propose to use a secure platform 
for the execution of (remote) code in a smart card which 
functions as follows: 


• The smart card implements an interpreter for mobile
code written in a domain-specific language
optimally supporting the intended application domain.


• A client such as a service provider sends messages
containing so-called scripts written in the
domain-specific language the card-resident interpreter
understands.


• The card’s runtime platform executes the script,
handles user interaction, and sends back the
responses to the client.


• The platform implements key management
facilities in order to provide end-to-end security
between the client and the smart card.


Such a platform must be secure in the sense that neither
the mobile code nor the user is able to harm the
platform’s integrity. Furthermore, the platform gives certain
guarantees to both code and user that the scripts are
executed as intended and that no information leakage or
secret storage manipulation can be caused by malicious
code or an external attacker.

Thus, the platform acts as a trusted computing base
running in a tamper-resistant device protecting the user
from the code and vice versa.


 1
 2
 3    provider "bidbiz.com";
 4    name "bidbiz auction client";
 5    id "20011223/24357";
 6    options signed-interaction;
 7  }
 8  implementation {
 9    playtone;
10    push("News from bidbiz.com:\nBid in auction #3576 (Antique watch): EUR 63.");
11    display;
12
13    push(mark);
14    push("Place new bid?");
15    push("New bid...");
16    push("Cancel");
17    select;     <-- User selects option: (int, '1')
18
19    push(2);
20    eq?;
21    if (true) goto end;
22
23  enter:
24    push("Enter new bid (>EUR 63):");
25    input;      <-- User inputs new bid amount: (string, '70')
26    dup();
27    push(63);
28    le?;
29    if (true) goto end;
30    playtone;
31    push("Please enter a bid greater than EUR 63.");
32    display;
33    goto enter;
34
35  end: sign-interaction;
36    response;
37    exit;
38  }
39 }





Figure 1. Mobile auction client with interaction signatures 


Example: Mobile Auctions 


For illustration purposes we provide an example of
our approach in the domain of mobile auctions
(Fig. 1). This example is based on the one presented in
[8]; however, it has been extended to support the creation
of signatures on user interaction.

The given example illustrates the use of the stack- 
based domain-specific language we use to write our 
scripts without going into full detail. An in-depth de- 
scription of the language can be found in [10]. 

A script starts with header information about the name
of the script and its provider (lines 3-5). Line 6 denotes
that the script’s execution should be implicitly signed by
the interpreter. The implementation part (line 8)
contains the actual program.

Lines 9-11 demonstrate how to display an initial
message about the latest news of the online auction.
Lines 13-17 show how the arguments for a user selection
(primitive select) are pushed onto the stack marked by
the initial marker set in line 13. After the selection has
been performed the arguments including the marker are
removed from the stack and the number of the selected
item is available on the stack.


Lines 19-21 check whether the subscriber selected
item no. 2 (i.e. “Cancel”), in which case a jump to the
label ‘end’ is performed. Otherwise an input dialog
is opened in lines 24-25 and the input from the
subscriber is returned on the topmost stack position and
duplicated in line 26. Then the entered amount is checked
in lines 27-29 to see whether it is greater than 63. If so,
execution jumps to the label ‘end’; otherwise a text is
displayed in lines 30-32 and execution resumes
at the input dialogue (label ‘enter’).


Finally, in line 35, the whole recorded execution of the
script is signed and a signature object containing all the
relevant information about the script’s execution
including the signature is pushed onto the stack. The signature
object is then sent back to the originator in line 36 and
execution terminates.

During execution the runtime environment collects
the non-deterministic input from the various sources into
a log $L = \{i_1, \ldots, i_n\}$ of inputs $i_j$. In the above example
execution thus yields


$L = \{(int, '1'), (string, '70')\}$,


i.e., for each input we record the type information and
the data. The overall interactive log of an execution of
script P with the identifier $id_P$ is computed and returned
together with additional platform information R to the
original sender S as follows:


$C \to S$: $\{id_P, L, R, sig_{S_S}(hash(id_P, P, L, R))\}$.


The receiver must be able to verify the authenticity of 
the signature by simulating the execution of the script 
according to the log L. Based on this simulation, the in- 
teraction of the script and the user can be replayed and 
the user’s choices and inputs can be examined to take ap- 
propriate action. 

The execution of the script should occur in a trans- 
actional context, i.e. if for some reason the execution is 
terminated, no signature is created. 
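
The record-and-replay idea can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the card platform itself: `run_script` re-expresses the control flow of the auction script in Python instead of interpreting the stack language, `SCRIPT_SOURCE` is a placeholder for the script text P, the JSON encoding is an arbitrary choice, and HMAC under a hypothetical `CARD_KEY` stands in for the card's signature (so the verifier here shares the key, which a public-key scheme would avoid).

```python
import hashlib
import hmac
import json

CARD_KEY = b"card-signing-key"       # hypothetical stand-in for the signing key
SCRIPT_ID = "20011223/24357"         # id taken from the script header in Fig. 1
SCRIPT_SOURCE = "placeholder for the source text of script P"


def run_script(next_input):
    """Control flow of the auction script, re-expressed in Python."""
    if next_input("int") == "2":     # item no. 2 is "Cancel"
        return None
    while True:
        bid = int(next_input("string"))
        if bid > 63:
            return bid


def _signature(log, platform):
    payload = json.dumps([SCRIPT_ID, SCRIPT_SOURCE, log, platform], sort_keys=True)
    digest = hashlib.sha256(payload.encode("utf-8")).digest()
    return hmac.new(CARD_KEY, digest, hashlib.sha256).hexdigest()


def execute_on_card(user_inputs, platform):
    """Run the script, logging every non-deterministic input with its type."""
    pending, log = list(user_inputs), []

    def next_input(kind):
        if not pending:
            # Transactional behaviour: an aborted run produces no signature.
            raise RuntimeError("execution aborted")
        value = pending.pop(0)
        log.append([kind, value])
        return value

    run_script(next_input)
    return {"id": SCRIPT_ID, "log": log, "platform": platform,
            "sig": _signature(log, platform)}


def verify_and_replay(message):
    """Receiver side: check the signature, then replay the script from the log."""
    expected = _signature(message["log"], message["platform"])
    if not hmac.compare_digest(expected, message["sig"]):
        raise ValueError("signature does not match the log")
    recorded = list(message["log"])

    def next_input(kind):
        logged_kind, value = recorded.pop(0)
        if logged_kind != kind:
            raise ValueError("log does not fit the script")
        return value

    return run_script(next_input)


message = execute_on_card(["1", "70"], {"interpreter": "1.0"})
bid = verify_and_replay(message)
```

Replaying the log through the same script reproduces the user's choices, here the selection of item 1 followed by the bid of 70, without the signed object having to be human-readable.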


Summary 


Using signatures on runtime execution audits com- 
bined with recording user interactions as a means to im- 
plement non-repudiation is, to the best of our knowledge, 
a novel approach. We consider this approach particularly 
useful for our application domain for the following rea- 
sons: 


e Due to the lack of user input and output facilities, 
performing all possible executions within the trust 
domain of the smart card is from a security point 
of view desirable. 


e All interaction which leaves the trust boundary of 
the smart card is reduced to the bare minimum, i.e. 
to user interactions only. 


e The approach is very flexible, since it offers scripts 
a full control over the way signatures are built, 
how encryption is performed, and how interaction 
takes place. As such it is able to offer applications 
means to implement security policies as needed. 


Thus, our approach makes it possible to take full advantage
of the smart card as an open platform for running
security-critical applications in the tamper-resistant context of the
physical device. More precisely, it represents one
instance of the on-card hash computation approach as
presented in Sect. 2.2. Furthermore, it can be easily
extended to also support the third-party assisted approach
in Sect. 2.3 and the recipient addressing approach in
Sect. 2.4, assuming available key management facilities
as described in [10].


3 Related Work 


The main contribution of our research is in increasing 
the trustworthiness of digital signatures by building as 
much as possible on the security properties of execution 
platforms in smart cards. 

Alternatively, one can try to enhance the
trustworthiness of devices; [11, 12] discuss portable end-user
devices (POBs) and security modules and define a
number of requirements to be made for such devices. They
observe that trustworthy POBs do not exist and
conclude that therefore the development of secure
applications should concentrate on protocols and procedures. A
related approach, described e.g. in [9], comprises
two different devices, a PDA and a smart card, that
together implement a security-sensitive application: the
smart card does not perform its task without the PDA
and the PDA cannot perform the task without the help of
the smart card.

One of the key ideas of this paper is documenting
user interaction involved with digital signatures; a
lightweight scripting language suitable for on-the-fly
download to smart cards has been proposed in [8, 10].

The actual runtime execution monitoring using an ex- 
ecution log has been investigated by Vigna as a means 
to protect the execution of mobile agents in hostile en- 
vironments [14]. The sender of a mobile agent can use 
this signed execution log to verify whether the agent has 
been tampered with while executing on a remote agent 
platform. 


4 Conclusion 


Starting from the observation that the process of elec- 
tronic signature creation is still vulnerable in many prac- 
tical settings, we have proposed three protocol variants 
that aim at shifting functionality from untrusted compo- 
nents into a smart card. 

The first option considers on-card hash computation
combined with end-to-end secure transfer of the
document to be signed from the originator to the smart card.
Another approach uses a trusted third party to perform
resource-intensive computation of the document’s hash
outside the card. The third approach is characterized by
the integration of the identity of the document’s originator
into the protocol, eliminating further attacks. However,
so-called “conspiracy attacks” in which an intruder and
an originator cooperate can still be mounted, albeit less
easily.

Based on the new protocols a novel approach for the
creation of electronic signatures based on a runtime
execution platform for smart cards has been presented. This
approach is able to include the different protocol options
presented and is especially suited for use in mobile
settings characterized by limited device capabilities in
terms of user input and output. We have illustrated our
approach with an example in the domain of mobile
auctions, an application that is ideally suited to be run on
mobile phones.

Generally, we suggest that GSM SIMs and UMTS
USIMs might be ideal candidates for hosting such a
smart card platform. Our results demonstrate that
improvements in the electronic signature creation process
are feasible if the environment in which the creation takes
place is suitably taken into consideration.


Abstract


The use of biometrics, and fingerprint recognition in 
particular, for cardholder authentication in smart- 
card systems is growing in popularity. In such 
a biometrics-based cardholder authentication sys- 
tem, sensitive data may be transferred between the 
smartcard and the card reader. In this paper we 
identify and classify possible threats to the commu- 
nications link between card and card reader during 
cardholder authentication. We also analyse the im- 
pact of these threats. We consider five different ar- 
chitectures and use the threat analysis to indicate 
the relative security of the various possible architec- 
tures. 


1 Introduction 


1.1 Biometrics and smartcards 


Biometrics has been widely recognised as a power- 
ful tool for problems requiring personal identifica- 
tion. Most automated identity authentication sys- 
tems in use today rely on either the possession of a 
token (magnetic card, USB token) or the knowledge 
of a secret (password, PIN) to establish the iden- 
tity of an individual. The main problem with these 
traditional approaches to identity authentication is 
that tokens or PIN/passwords can be lost, stolen, 
forgotten, misplaced, guessed, or willingly given to 
an unauthorised person. Biometric authentication, 
on the other hand, is based on physiological or be- 
havioural characteristics of the individual, such as 
fingerprints, and therefore does not suffer from the 
disadvantages of the traditional methods. 


In parallel, smartcards have steadily become more 
popular. Their increasing storage capacity and pro- 
cessing capabilities have enabled their deployment 
in a widening range of applications, varying from 
support for PKI to decentralised systems requiring 
off-line transactions [1, 2, 3]. Generally any
application using smartcards requires a method for
cardholder authentication, and biometrics-based
authentication has emerged as an appropriate technology.


Combining the security of biometrics and the com- 
puting power of a smartcard is a very elegant so- 
lution to cardholder authentication. On the one 
hand biometrics can provide the level of security 
required by applications using smartcards. On the 
other hand, smartcards enable the biometrics tech- 
nology by offering a secure and portable way of stor- 
ing the biometrics template, which would otherwise 
need to be stored in a central database. Fingerprint 
recognition appears particularly appropriate for use 
in biometric systems using smartcards. 


A smartcard system is composed of two main 
physical units: the smartcard itself and the card 
reader. In biometrics-based cardholder authenti- 
cation, transmission of sensitive data between the 
smartcard and the card reader may occur depending 
on how the biometric system is distributed between 
these two units. In this paper, we consider the secu- 
rity issues associated with the communications link 
between the smartcard and the card reader during 
the biometrics-based cardholder authentication pro- 
cess. 


Before we set out the objectives of this paper in 
detail, it is important to clarify the biometrics-based 
cardholder authentication process. 


1.2 General model for biometric authentication


According to [4], a general biometric system is com- 
posed of the following logical modules: 


1. Data collection subsystem;
2. Signal processing subsystem;
3. Matching subsystem;
4. Storage subsystem;
5. Decision subsystem;
6. Transmission subsystem.


The data collection subsystem contains the input 
device or sensor that captures the biometric infor- 
mation from the user. It is the link between the 
physical domain and the logical domain. The sig- 
nal processing subsystem receives the raw biomet- 
ric data from the data collection subsystem and ex- 
tracts the distinguishing features from the raw data, 
transforming it into the form required for match- 
ing. The matching subsystem receives the processed 
data from the signal processing subsystem and com- 
pares it with the biometric template retrieved from 
the storage subsystem. The matching subsystem 
measures the similarity of the submitted biometric 
sample with an enrolled reference template. Each 
comparison yields a score, which is a numeric value 
indicating how closely the submitted sample and the 
reference template match. The decision subsystem 
receives the score from the matching subsystem and, 
using a confidence value based on security risks and 
risk policy, interprets the result of the score, thus 
reaching an authentication decision. The transmis- 
sion subsystem provides the system the ability to 
exchange information between all other subsystems. 
Figure 1 shows a block diagram for the general bio- 
metric authentication model. 
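
The chain of logical modules described above can be sketched in a few lines of Python. This is only an illustration of how the modules hand data to one another, not a real fingerprint algorithm: the feature representation (a set of integers), the similarity measure (Jaccard overlap) and the threshold value are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    score: float
    accepted: bool


def collect(raw_sample):
    """Data collection: stand-in for the sensor, returns the raw sample."""
    return list(raw_sample)


def extract_features(raw):
    """Signal processing: reduce raw data to distinguishing features.

    Assumption: features are modelled as a set of integers.
    """
    return frozenset(raw)


def match(features, template):
    """Matching: similarity score in [0, 1] (Jaccard overlap, illustrative)."""
    union = features | template
    return len(features & template) / len(union) if union else 0.0


def decide(score, threshold):
    """Decision: compare the score with a policy-driven threshold."""
    return Decision(score=score, accepted=score >= threshold)


def authenticate(raw_sample, stored_template, threshold=0.8):
    """Run the full chain; the template comes from the storage subsystem."""
    features = extract_features(collect(raw_sample))
    return decide(match(features, stored_template), threshold)
```

In this sketch the transmission subsystem is implicit in the function calls; in a smartcard system some of those calls would cross the card/reader link.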


Note that these are logical modules, and therefore
some systems may integrate several of these
components into one physical unit.


1.3 Scope and purpose


In this paper we focus on the security issues asso- 
ciated with the communications link between the 
smartcard and the card reader during fingerprint-
based cardholder authentication. PIN-based card- 
holder authentication has been well researched and 
understood, giving rise to a variety of industry stan- 
dards, such as [5, 6, 7]. Encryption is typically used 
to provide security for PINs during transmission, 
either from the keypad to the card (for local cardholder
authentication) or from the keypad to a remote server
(for remote authentication of the cardholder).


However, for the purposes of our analysis, we do not 
make any assumptions about encryption or other 
cryptographic protection of the card/card reader 
communications link. This is because, whereas PINs 
are very short, biometric samples, e.g. fingerprint 
images, are rather large, and the limited compu- 
tational and storage capabilities of the card may 
severely limit the possibilities for such protection. 


Given our focus on card/reader communications, 
and the objective of assessing the best level of inte- 
gration of the biometric technology, we make certain 
other simplifying assumptions. We assume that the 
smartcard is a tamper-proof device and any trans- 
mission between biometric system modules taking 
place within the card is therefore secure. We do not 
discuss the impact of using fake biometrics, such as 
plastic fingers, to fool the system, although it was 
shown in [8] that this is a possible attack with the 
current technology. We feel that this issue concerns 
fingerprint-based biometric technology in a wider 
sense and is therefore beyond the scope of our dis- 
cussion. 


In previous related work [9], a number of weaknesses 
in the biometric system model have been identified, 
and countermeasures suggested. However, in that 
analysis no assumptions as to the actual architec- 
ture of the system are made, and the analysis is 
rather general in nature. By contrast, the main pur- 
pose of this paper is to understand what security 
gains can be made from the various possible levels 
of integration of the biometric system on the smart- 
card. 


Depending on how the logical modules of the bio- 
metric system are distributed between the smart- 
card and the card reader, different threats may arise. 
We consider five scenarios for the biometric system 
and, for each scenario, we identify and classify pos- 
sible threats to the communications link and assess 
the impact of these threats. In all scenarios we assume
that the smartcard stores the template for the cardholder
fingerprint. We also assume throughout that fingerprint
recognition is used as a method of cardholder
authentication to the smartcard.


Figure 1: General model for biometric authentication.


In Section 2 we describe the five biometric system 
architectures considered in this paper. In Section 
3, we discuss the sources of communications link 
threats and then identify and classify the possible 
threats. In Section 4, we assess the impact of the 
threats identified in the previous section. Finally, 
we present our conclusions in Section 5. 


2 Possible biometric system scenarios


Five different scenarios are considered, and the rela- 
tive risks associated with each scenario are analysed. 
The scenarios cover various possibilities for the dis- 
tribution of the modules of the biometric system be- 
tween the smartcard and the card reader. Note that 
in all cases we assume that the fingerprint template 
is stored in the smartcard. 


The scenarios are as follows: 


S1. The fingerprint sensor is built into the card 
reader. The user template is transferred from 
card to reader. The reader takes the image 
provided by its built-in fingerprint sensor, per- 
forms the feature extraction, and also matches 
the features to the template provided by the 
card. The reader then informs the card whether 
or not authentication has been successful. 


S2. The fingerprint sensor is built into the card. 
The fingerprint image and user template are 
transferred from card to reader. The reader 
performs feature extraction and matching of 
features to the template. The reader then in- 
forms the card whether or not authentication 
has been successful. 


S3. The fingerprint sensor is built into the card 
reader. The reader takes the image provided 
by the built-in fingerprint sensor and performs 
the feature extraction. The extracted features 
are sent to the card, which then performs the 
matching process and reaches the authentica- 
tion decision. 


S4. The fingerprint sensor is built into the card.
The fingerprint image is transferred from card 
to reader. The reader performs feature extrac- 
tion only, and transfers the extracted features 
back to the card. The card then performs the 
matching process. 


S5. All fingerprint processing takes place on the 
card. 


Figure 2 shows the first four scenarios and their cor- 
responding data flow during biometric cardholder 
authentication. Table 1 below defines all the scenar- 
ios in terms of the location of the various biometric 
modules. 
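
The five scenarios can also be written down as a small data structure, from which the data crossing the card/reader link follows mechanically. The sketch below is our own encoding of the descriptions of S1 to S5 above; the names `SCENARIOS`, `FLOWS` and `link_traffic` are hypothetical, and we assume that the authentication decision is taken wherever matching takes place and is always consumed on the card.

```python
CARD, READER = "card", "reader"

# Location of each logical module per scenario, following S1-S5 above.
# The template store is on the card in every scenario.
SCENARIOS = {
    "S1": {"sensor": READER, "extraction": READER, "matching": READER, "storage": CARD},
    "S2": {"sensor": CARD, "extraction": READER, "matching": READER, "storage": CARD},
    "S3": {"sensor": READER, "extraction": READER, "matching": CARD, "storage": CARD},
    "S4": {"sensor": CARD, "extraction": READER, "matching": CARD, "storage": CARD},
    "S5": {"sensor": CARD, "extraction": CARD, "matching": CARD, "storage": CARD},
}

# (data item, producing module, consuming module).
# Assumption: the decision is produced where matching is done and is
# consumed by the application on the card.
FLOWS = [
    ("image", "sensor", "extraction"),
    ("template", "storage", "matching"),
    ("features", "extraction", "matching"),
    ("decision", "matching", "card_application"),
]


def link_traffic(scenario):
    """Return the (data item, direction) pairs that cross the card/reader link."""
    placement = dict(SCENARIOS[scenario], card_application=CARD)
    crossing = []
    for item, src, dst in FLOWS:
        if placement[src] != placement[dst]:
            # "up" is card to reader, "down" is reader to card.
            direction = "up" if placement[src] == CARD else "down"
            crossing.append((item, direction))
    return crossing
```

For instance, `link_traffic("S3")` yields only the extracted features on the down-link, while `link_traffic("S5")` is empty because nothing leaves the card.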


3 Security threats 


The focus of this paper is on the communications 
link between smartcard and card reader, and hence 
we only consider threats that relate, directly or indi- 
rectly, to this link. The main threats to this link can 
be divided into threats to the up-link (i.e. smartcard 
to reader) and down-link (i.e. reader to smartcard). 
The threats also vary depending on the scenario. 


Note that, before identifying the threats to the up 
and down links, we briefly consider the possible 
source of these threats. Also, as well as identify- 
ing threats to the up-link and down-link, we briefly 
consider threats to the card reader itself. This is be- 
cause the threats to the card reader indirectly relate 
to communications link protection (see below). 


3.1 Sources of communications link threats


There would appear to be three main ways in which 
an attacker could intercept and/or manipulate data 
being transferred between card and card reader. 


• The card reader (and/or smartcard) may emit
electromagnetic signals which are data depen- 
dent, and which can be intercepted using an an- 
tenna located close to the reader. Such an ap- 
proach would only enable passive (interception) 
rather than active (manipulation/replacement) 
attacks. The seriousness of this threat depends 
on the design of smartcard and card reader. 


• A special interception device could be inserted
into the read slot of the card reader, and the 
device would then be located between any in- 
serted smartcard and the card reader. By this 
means, and without any modifications to the 
card reader, both passive and active attacks 
may be realised. The seriousness of such a 
threat will depend on a variety of factors in- 
cluding the design of the card reader and the 
environment in which the reader itself is lo- 
cated. Observe that, given that the primary 
threat would appear to arise from an attacker 
equipped with a lost, stolen or borrowed card, 
the seriousness of this threat will relate to 
whether or not use of the card reader is super- 
vised by trusted personnel (who might detect 
the use of additional devices). 


• The card reader could be modified. At the sim-
plest level this could mean the insertion of a 
‘bug’ designed to monitor and perhaps modify 
data communications. (See also Section 3.4 be- 
low). 


We do not discuss the magnitude of these threats 
further here, since all three threats are very much 
implementation-dependent and therefore any fur- 
ther analysis would be highly speculative. How- 
ever, it is clear that, wherever possible, card readers 
should be designed to minimise these threats, par- 
ticularly if sensitive information is transferred be- 
tween smartcard and reader without cryptographic 
protection. 


3.2 Up-link threats 


The main up-link threats are as follows: 


U1. (S1 and S2 only). Interception (leading to loss 
of confidentiality) of the user fingerprint tem- 
plate. 


U2. (S1 and S2 only). Manipulation (or replace- 
ment) of the user fingerprint template. 


U3. (S2 and S4 only). Interception (leading to loss 
of confidentiality) of the fingerprint image. 


U4. (S2 and S4 only). Manipulation (or replace- 
ment) of the fingerprint image. 


Figure 2: Four different scenarios and their corresponding data flow during cardholder authentication.

Table 1: Five different biometric system scenarios in increasing order of integration of the biometric modules.


Threats U1 and U3 could be addressed by encrypt-
ing the communications path, although the effec- 
tiveness of such a measure would depend on the 
physical security of the card reader (since keys nec- 
essary to decrypt the transferred data would need to 
be available to the card reader). Addressing threats 
U2 and U4 would require the provision of data 
integrity and origin authentication services for the 
data transfer between card and reader (e.g. as pro- 
vided by a Message Authentication Code (MAC) or 
a digital signature — see, for example, [2]). 
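
As an illustration of the integrity service mentioned above, the following sketch protects a transferred template or image with an HMAC. It is a minimal example under stated assumptions, not a proposal for a concrete card protocol: we assume a symmetric key already shared between card and reader, we pick HMAC-SHA-256 arbitrarily, and we add a fixed-length 16-byte nonce (our own addition, not part of the analysis above) so that a previously recorded message cannot simply be replayed. The function names are hypothetical.

```python
import hashlib
import hmac

NONCE_LEN = 16  # assumption: fixed-length nonce chosen by the receiver


def protect(key: bytes, nonce: bytes, payload: bytes) -> bytes:
    """Sender side: compute a MAC tag over nonce || payload."""
    if len(nonce) != NONCE_LEN:
        raise ValueError("nonce must be exactly NONCE_LEN bytes")
    return hmac.new(key, nonce + payload, hashlib.sha256).digest()


def verify(key: bytes, nonce: bytes, payload: bytes, tag: bytes) -> bool:
    """Receiver side: recompute the tag and compare in constant time."""
    if len(nonce) != NONCE_LEN:
        return False
    expected = hmac.new(key, nonce + payload, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)
```

Whether a card can afford such a computation over a full fingerprint image depends on its processing capabilities, and the key held by the reader is only as safe as the reader itself.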


3.3 Down-link threats


The main down-link threats are as follows: 


D1. (S1 and S2 only). Modification of the authen-
tication decision. 


D2. (S3 and S4 only). Interception (leading to loss 
of confidentiality) of the fingerprint features. 


D3. (S3 and S4 only). Manipulation (or replace- 
ment) of the fingerprint features. 


Threat D2 could be addressed by encrypting the 
communications path, although the effectiveness of 
such a measure would depend on the means used to 
protect the necessary key(s). Addressing threats D1 
and D3 would require the provision of data integrity 
and origin authentication services for the data trans- 
fer between reader and card (e.g. as provided by a 
MAC or a digital signature). 
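
The up-link and down-link threats, the scenarios to which they apply, and the kind of service that addresses them can be tabulated as follows. The encoding is ours (the names `THREATS`, `threats_for` and `countermeasure` are hypothetical); the content is taken directly from the two lists above.

```python
# threat id -> (link direction, data item, kind of attack, affected scenarios)
THREATS = {
    "U1": ("up", "template", "interception", {"S1", "S2"}),
    "U2": ("up", "template", "manipulation", {"S1", "S2"}),
    "U3": ("up", "image", "interception", {"S2", "S4"}),
    "U4": ("up", "image", "manipulation", {"S2", "S4"}),
    "D1": ("down", "decision", "manipulation", {"S1", "S2"}),
    "D2": ("down", "features", "interception", {"S3", "S4"}),
    "D3": ("down", "features", "manipulation", {"S3", "S4"}),
}


def threats_for(scenario):
    """Return the sorted ids of the link threats relevant to a scenario."""
    return sorted(t for t, (_, _, _, where) in THREATS.items() if scenario in where)


def countermeasure(threat):
    """Map a threat to the class of service suggested in the text."""
    kind = THREATS[threat][2]
    if kind == "interception":
        return "encryption"
    return "data integrity and origin authentication"
```

For example, `threats_for("S4")` returns `['D2', 'D3', 'U3', 'U4']`, and `threats_for("S5")` returns an empty list.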


3.4 Threats to card reader 


Three main types of threat to the card reader can 
be identified. Although these threats are not di- 
rectly relevant to smartcard/reader communications 
security, they do have indirect relevance (see below). 
The three main classes of threat are as follows. 


• Manipulation of a genuine card reader. This
includes the insertion of a ‘bug’ (as mentioned 
in Section 3.1), but also includes threats where 
the operation of the reader is modified, e.g. by 
changing stored software. 


• Replacement of the card reader. This refers to
the substitution of the genuine reader with a 
fraudulent replacement. (Whether or not this 
could be achieved without preventing correct 
operation of the system depends on both the 
card reader design and the design of the re- 
mainder of the system). 


• Theft and/or reverse engineering of the card
reader. Such a threat could be very serious if 
the reader contains secrets on which the system 
security depends. 


4 Impact of security threats 


We next consider the impact of the various threats 
identified in the previous section. We divide this 
discussion into the following sub-categories: 


• threats arising from attempted use of a lost,
stolen or borrowed card;


• threats to integrity of card transactions;


• threats to cardholder privacy.


4.1 Use of lost, stolen or borrowed cards 


As a basis of this discussion we assume that the 
possessor of a misappropriated (lost, stolen or bor- 
rowed) card wishes to make use of this card, e.g. to 
perform some kind of transaction. In order to do 
so, he/she will need to find some way of ‘fooling’ 
the cardholder authentication process. 


There are a variety of ways this could be achieved, 
as follows. Note that in each case we indicate which 
threat identified in Section 3 above is giving rise to 
the issue. 


• Arising from U2 (and hence applying to S1
and S2 only): replace the fingerprint template 
as sent on the up-link with a fingerprint tem- 
plate belonging to the possessor of the misap- 
propriated card. 


For such an attack to be viable, the attacker 
will need to have a fingerprint template for 
his/her own fingerprint in the format used by 
the scheme. There are a number of possible
ways in which this could be obtained. 


— If the attacker has his/her own card, this 
could easily be obtained by monitoring the 
output from the attacker’s own card. 


— If the attacker knows the type of finger- 
print reader in use (either built into the 
card reader (S1) or built into the card 
(S2)) and the method used to obtain the 
template, then the attacker could obtain 
a fingerprint reader of this type and use 
it, together with appropriate software, to 
compute a template. 


— The attacker could use a misappropriated 
card to obtain a copy (or many copies) of 
a fingerprint image (from threat U3, i.e.
S2 only) for his/her own fingerprint. With
knowledge of the method used to extract a 
template, together with appropriate soft- 
ware, the attacker could compute a tem- 
plate. 


If U2 is realisable, then this risk has to be clas- 
sified as high, since, for many systems, protect- 
ing one cardholder against another fraudulent 
cardholder is a necessary requirement. 


• Arising from U4 (and hence applying to S2
and S4 only): replace the fingerprint image as 
sent on the up-link with a fingerprint image be- 
longing to the legitimate cardholder. 


For such an attack to be viable, the attacker 
will need to have a fingerprint image for the 
genuine cardholder. There are a number of pos- 
sible ways in which this could be obtained. 


— From threat U3 (and hence applying to 
S2 and S4 only). Note that this would 
require threat U3 to be realised before the 
time of misappropriation. This may not 
be easy to arrange. 


— If the attacker knows how the fingerprint 
reader in use operates, and has access to a 
fingerprint image of some kind for the gen- 
uine user (e.g. by taking an image from an 
object touched by the genuine cardholder) 
then it may be possible to transform this 
latter image into one conforming to the 
scheme in use. 


If U4 and U3 are realisable for the same card,
then this risk has to be classified as high. Note
that even if both threats are realisable, successfully
taking advantage of both threats with respect
to the same card may be much more difficult.
If U4 is realisable but not U3, then the
risk is lower (say medium), depending on
the details of the fingerprint imaging technology
being used.


• Arising from D1 (and hence applying to S1
and S2 only): change the authentication de- 
cision sent from reader to card from ‘Reject’ to 
‘Accept’. 


This is trivially easy to perform (given that 
threat D1 is realised). If D1 is realisable, then 
this risk has to be classified as very high. 


• Arising from D3 (and hence applying to S3
and S4 only): replace the fingerprint features 
sent on the down-link with features extracted 
from the cardholder’s fingerprint. 


For such an attack to be viable, the attacker 
will need to have a copy of fingerprint features 
for a fingerprint of the genuine cardholder in 
the format used by the scheme. There are a 
number of possible ways in which this could be 
obtained. 


— From threat D2 (and hence applying to 
S3 and S4 only). Note that this would 
require threat D2 to be realised before the 
time of misappropriation. This may not 
be easy to arrange. 


— If the attacker knows how the fingerprint 
feature extraction method in use operates, 
and has access to a fingerprint image of 
some kind for the genuine user (e.g. by re- 
alising threat U3 before the time of mis- 
appropriation, or by taking an image from 
an object touched by the genuine card- 
holder) then it may be possible to derive a 
workable set of features conforming to the 
scheme in use. 


If D3 and D2 are realisable for the same card, 
then this risk has to be classified as high. Note 
that even if both threats are realisable, success- 
fully taking advantage of both threats with re- 
spect to the same card may be much more dif- 
ficult. If D3 is realisable but not D2, then 
the risk is lower — say medium — depending 
on the details of the fingerprint imaging and 
feature extraction technology being used. Note 
also that threat D3 could be reduced if a secret 
feature extraction technique is used — for this 
to be effective the card readers in use would
need to possess physical security features (see 
Section 3.4). Threat D3 could also be reduced 
(if not eliminated) if it was possible for the card 
to verify that the fingerprint features provided 
by the card reader indeed belong to the image 
provided to the card reader. 


Note that none of these threats apply to scenario S5, 
which is not prone to attack on the communications 
path since this path is not used for the cardholder 
authentication process. 
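
The risk classification reached in this section can be restated as a small rule set. The sketch below is our own summary of the ratings given above, with hypothetical names; it simplifies in one respect, in that it treats two threats as jointly realisable whenever both are in the input set, whereas the analysis requires them to be realisable for the same card.

```python
LEVELS = ["none", "medium", "high", "very high"]

# Scenarios to which each link threat applies (Section 3).
APPLIES_TO = {
    "U1": {"S1", "S2"}, "U2": {"S1", "S2"},
    "U3": {"S2", "S4"}, "U4": {"S2", "S4"},
    "D1": {"S1", "S2"}, "D2": {"S3", "S4"}, "D3": {"S3", "S4"},
}


def misuse_risk(scenario, realisable):
    """Worst-case risk of misuse of a misappropriated card.

    `realisable` is the set of threat ids assumed to be realisable in the
    deployment; threats not relevant to the scenario are ignored.
    """
    live = {t for t in realisable if scenario in APPLIES_TO.get(t, ())}
    ratings = ["none"]
    if "D1" in live:
        ratings.append("very high")
    if "U2" in live:
        ratings.append("high")
    if "U4" in live:
        ratings.append("high" if "U3" in live else "medium")
    if "D3" in live:
        ratings.append("high" if "D2" in live else "medium")
    return max(ratings, key=LEVELS.index)
```

For example, `misuse_risk("S4", {"U4"})` gives `"medium"`, and any set of threats gives `"none"` for S5.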


We summarise the results of the above analysis in
Table 2.


4.2 Card transaction integrity


Whilst there may be many risks to the integrity of 
card transactions, we restrict our attention here to 
the impact of threats to the card/reader communi- 
cations link. 


The only impact which results from the analysis de- 
scribed here is an indirect one. If any of the threats 
relevant to the particular scenario are realisable, 
then this may give a cardholder the ability to dis- 
pute transactions after they have occurred. That 
is, if a fraudulent cardholder knows of the existence 
of certain threats which would allow successful use 
of a lost, stolen or borrowed card, then the card- 
holder could, after completion of a genuine transac- 
tion, claim that the transaction had been performed 
by someone else using a lost, stolen or borrowed 
card. 


4.3 Cardholder privacy threats


The other main area of impact of the threats identi- 
fied in Section 3 is to the privacy of the cardholder. 
That is, the cardholder may have concerns relating 
to who has access to information relating to his or 
her fingerprint. Note that we are concerned here 
purely with privacy concerns, unrelated to any pos- 
sible threat of fraud. 


The following impacts arise from the identified 
threats. 


• Arising from U1 (and hence applying to S1
and S2 only): loss of confidentiality of user
fingerprint template.


• Arising from U3 (and hence applying to S2
and S4 only): loss of confidentiality of user
fingerprint image.


• Arising from D2 (and hence applying to S3
and S4 only): loss of confidentiality of user
fingerprint features.


The three impacts are rather similar to one another, 
and all have an impact on user privacy. The choice 
of scenario (apart from the fact that S5 is unaf- 
fected) has little bearing on the degree of the im- 
pact. 


5 Conclusions 


The main purpose of the analysis in this paper is to 
understand how to integrate biometric cardholder 
authentication with a smartcard in the most cost ef- 
fective manner. In particular we have sought to un- 
derstand what is to be gained from the various pos- 
sible levels of integration of biometric system with 
the smartcard. 


First and foremost it is clear that scenario S5 is un- 
affected by the security of the communications path 
since in that scenario the card/reader communica- 
tions path is not used (at least for the cardholder 
authentication process). Thus scenario S5 is clearly 
the best in an absolute sense — however it is also 
likely to be the most costly to deploy. It is there- 
fore interesting to understand how the other four 
scenarios compare, bearing in mind that, of these 
four, scenario S1 is likely to be the cheapest option 
and scenario S4 the most expensive, since they rep- 
resent the lowest and the highest level of integration 
respectively. 


For the other four scenarios, the cardholder privacy 
threat is very similar regardless of the scenario. The 
main issue would appear to be fraudulent use of 
misappropriated cards. 


Of scenarios S1, S2, S3 and S4, it would appear
that scenarios S1 and S2 are very similar
with respect to their vulnerability to attacks on the
card/reader communications path. The degree to
which scenarios S3 and S4 reduce the risk depends
partly on technical issues relating to the format and
use of fingerprint images and features, and partly
on how easy it would be to both steal a
card and monitor its use prior to its theft.


Scenario   Impact
S1         Very high (if D1 realisable).
           High (if U2 realisable).
S2         Very high (if D1 realisable).
           High (if U2 realisable).
           High (if U3 and U4 realisable for the same card).
           Medium (if U4 realisable).
S3         High (if D2 and D3 realisable for the same card).
           Medium (if D3 realisable).
S4         High (if U3 and U4 realisable for the same card).
           High (if D2 and D3 realisable for the same card).
           Medium (if D3 realisable).

Table 2: Summary of impacts of misappropriated cards.


S4 represents a higher level of integration of the bio- 
metric system with the smartcard than S3. How- 
ever the integration of the fingerprint sensor with 
the smartcard in S4 makes the system vulnerable 
to threats U3 and U4 in the uplink. From that 
point of view, S4 would appear to be an architecture 
more open to attacks than S3. Note, however, that
when the fingerprint sensor is built into the card 
reader, the system becomes vulnerable to threats to 
the card reader (see Section 3.4). As suggested in 
[9], a fake card reader could be used to record the 
biometric data of legitimate users in an attack sim- 
ilar to a false ATM attack, which may potentially 
be an attack more easily realisable and more dam- 
aging than threats U3 and U4. Moreover, given 
that the sensor is a fragile piece of equipment, in- 
tegrating the sensor with the card reader is not a 
viable solution for many applications since it makes 
the system vulnerable to vandalism. 


The gain to be derived from integrating the finger- 
print sensor with the smartcard is minimal if all fin- 
gerprint feature extraction and matching are done 
off the card. However, depending on the environ- 
ment, significant gains can be achieved as long as 
the matching is performed on card, even when the 
feature extraction is performed off-card. 


It is interesting to note that almost all the most se- 
rious threats arise from an assumed lack of integrity 
for the data link. If it is assumed that the card 
reader is a trusted device and has not been interfered
with or replaced (see also Section 3.4), then
guaranteeing the integrity of the link between the 
card reader and the card would effectively prevent 
all the threats, even in the absence of any confiden- 
tiality for data transferred. 


Finally note that, given that the threats discussed 
mostly relate to use of misappropriated cards, the 
use of secure auditing and blacklisting measures 
within the application can help to minimise the im- 
pact of such threats. 


Abstract


In this paper we describe the Secure 
Method Invocation (SMI) framework imple- 
mented for JASON, our Javacard As Secure 
Objects Networks platform. JASON real- 
ises the secure object store paradigm, that 
reconciles the card-as-storage-element and 
card-as-processing-element views. In this 
paradigm, smart cards are viewed as secure 
containers for objects, whose methods can 
be called straightforwardly and securely us- 
ing SMI. JASON is currently being developed 
as a middleware layer that securely intercon- 
nects an arbitrary number of smart cards, 
terminals and back-office systems over the 
Internet. 


1 Introduction 


JavaCard¹ [Che00] technology makes it
possible to develop software for a smart card
using a high level language: JAVA. This
technology is platform independent, it can
handle multiple applications (each running
securely within its own sandbox) on one
smart card, post-issuance applications can
be added to it and it is compatible with
international standards like ISO7816 [ISO7816].


*Id: javacard-smi.tex,v 1.10 2002/09/23 06:04:03
hoepman Exp
¹ http://java.sun.com/products/javacard


In fact, the JavaCard platform brought 
high level, Object Oriented Programming 
(OOP) to the smart card developer. Unfor- 
tunately, the OOP paradigm is only applied 
to the software within the smart card it- 
self: invoking methods implemented by ob- 
jects on the smart card still requires the 
developer to send commands to the smart 
card using Application Protocol Data Units 
(APDUs) [ISO7816], which have to be processed
and transformed into method calls
‘by hand’. 


It would be much more natural to view an 
object stored on a JavaCard as a remote ob- 
ject, accessible through a remote method in- 
vocation mechanism. In fact, if we look at 
a smart card application at a higher level of 
abstraction, we basically see a large collec- 
tion of interconnected objects. Some of these 
objects are stored in back offices, others in 
terminals or PC’s and many more stored se- 
curely on millions of smart cards. This net- 
work is highly dynamic: smart cards are 
usually offline, and only connect to the net- 
work when they are inserted into a terminal 
(or when they connect to a terminal over a 
wireless interface in the case of contactless 
cards). Much more importantly, this network 
needs to be highly secure. Access to certain 
objects should be restricted, and the confid- 
entiality and authenticity of the communic- 
ation between the objects has to be guaran- 
teed. 


Hartel et al. [HJF95] pose that a smart 
card should be seen as a processing element 
rather than a storage element (as is tradi-
tionally done). In our opinion these views 
are not contradictory at all, but rather sup- 
plement each other nicely in the secure ob- 
ject store paradigm. In this paradigm, smart 
cards are viewed as secure containers for ob- 
jects, whose methods can be called straight- 
forwardly and securely using Secure Method 
Invocation (SMI). We are currently develop- 
ing the Javacards As Secure Objects Network 
(JASON) platform as a middleware layer (on 
these smart cards, terminals, PC’s and back 
office systems) to support this paradigm. By 
simplifying the communication with a smart 
card, and by providing extensive support to 
secure this communication, JASON aims to 
greatly simplify the development of smart 
card applications. 


In this paper we will describe the JASON 
Secure Method Invocation (SMI) scheme. In 
this scheme, a JASON definition file (JDF) (re- 
sembling a JAVA interface with some addi- 
tional keywords) is used to specify the access 
conditions on methods of an object. It also 
specifies how the parameters of a method 
call and the result should be protected when 
transmitted between caller and callee. The 
JDF is compiled into a stub (used by the 
caller to set up a connection with the object 
and to call its methods) and a skeleton (used 
by the callee to accept incoming method in- 
vocation requests and to handle the secur- 
ity requirements). The big advantage is that 
the smart card application developer only 
needs to specify the security requirements, 
but does not have to implement the security 
protocols himself. This is done automatic- 
ally, given the requirements. 


The remainder of this paper is organised 
as follows. We first present related research 
in the next section. Then, the main require- 
ments for the JASON platform are presented 
in Sect. 2. The design (in terms of the applic- 
ation programmers view on JASON) is given 
in Sect. 3. Section 4 discusses the architec- 
ture and the way the JASON SMI is actually 
implemented, while Sect. 5 presents a small 
example of using JASON to implement a ba- 
sic electronic purse. Finally, conclusions and 
issues for further research appear in Sect. 6.


1.1 State of the art


Itoi et al. [IFH00] add security to the Internet
infrastructure for smart cards developed
by Guthery et al. [Gut00, GBPR00] and Rees
et al. [RH00], adding the Simple Password
Exponential Key Exchange (SPEKE) protocol 
and using the DNS as a location independent 
naming scheme for the smart cards involved. 
These aspects will be taken into account in 
the networking and naming part of the JASON 
platform. 


Hagimont and Vandewalle [DH00] apply a
different approach to enforcing access control
on (remote) objects. Their JCCAP system
uses capabilities to specify which methods of
an object can be accessed by the owner of
that capability. Capabilities are implemented
through Java interfaces, and provide a
limited view on the full interface of an
associated object. This makes their system
dynamic (in the sense that capabilities can
be added and removed from the system
independently of the actual implementation of
the object, and that capabilities can be
delegated between objects). On the other hand,
they do not consider the general case of
caller and callee residing on different systems
separated by a network (as well as the
terminal/card line interface). Moreover, the
very important matter of protecting the data
transferred with an actual method call is not
considered in their work.


The latest JavaCard specification (2.2) in- 
cludes a lightweight version of Sun’s Remote 
Method Invocation (RMI) [Sun99]. It provides 
a mechanism for a client application running 
on the terminal to invoke a method on a re- 
mote object stored on the card just like an 
invocation within the same virtual machine. 
The parameters of a remote method should 
be primitive (byte, boolean, short, int) or a 
single-dimension array of a primitive type 
(byte[], boolean[], short[], int[]). Unlike stand-
ard Java RMI, object parameters (whether re-
mote or not) are not allowed. The method
result is of primitive type, a single-dimension
array of primitive type, a remote interface ob-
ject or void. All parameters and return values
are transmitted by value, except for the re-
mote object. The remote object is transmit-
ted by reference. We have investigated sev-
eral approaches to implementing our JASON 
Secure Method Invocation (SMI) system using 
RMI, but none are quite satisfactory. We dis- 
cuss this in Sect. 4.4. 


Kehr et al. [KRV00] describe the JiniCard
architecture, which allows seamless integra- 
tion of smart card services in a spontaneous 
network environment. The approach taken 
is to keep all functionality required to inter- 
act with a certain smart card remotely on the 
network, and to download this functionality 
into the card reader based on the ATR (An- 
swer To Reset) of the particular card inser- 
ted into it. They also discuss the service-as- 
object metaphor, but as far as security is con- 
cerned, they consider SSL sessions between 
card and terminal objects over which RMI 
calls are being sent. We, on the other hand, 
introduce a much finer security granularity 
at the method level. 


There are also a number of related in- 
dustry initiatives that deserve to be men- 
tioned here. 


The Global Platform Specification²
(formerly VISA's Open Platform specific-
ation) is concerned with the secure and
platform-independent installation and deletion
of applications on multi-application smart
cards.


The Open Card Framework³ (and the sim-
ilarly motivated PC/SC Workgroup⁴) aims to
allow software developers to build smart 
card-aware products without having to worry 
about platform, card terminal, or smart card- 
specific interfaces. It supplies an API for 
handling the communication between a PC 
application and a smart card reader. Since 
OCF is developed by the major smart card 
companies, it supports all kinds of smart 
cards and card readers. The application does 
not even have to know which smart card 
reader is being used during a communication 
session with a card. OCF does not specify the 
card side. The choice of a particular type of 
smart card is free and may change without 
changing the PC application. 


² http://www.globalplatform.org
³ http://www.opencard.org
⁴ http://www.pcscworkgroup.com/


2 Platform requirements 


With the JASON SMI system we want to 
achieve: 


• Separation of concerns: specifying se-
curity requirements (in the interface
definition of a JAVA applet using our
keyword approach), and their actual
implementation (provided once through
the JASON SMI system).


• Generic secured access to objects and
their methods, independent of their loc-
ation and whether they are on a compute
server or a smart card.


• Providing generic, interoperable tools to
secure method invocations, which can
be shared among objects (decreasing the
code size) and which can be verified
once (increasing robustness and avoid-
ing repeated verification of similar per-
applet security measures).


• Decreasing the complexity of writing se-
cure (smart card) applications.


3 Design 


The JASON platform implements the se- 
cure object store paradigm using the follow- 
ing layers. 


Network layer: Implements the direct con-
nection between clients, servers, ter-
minals and smart cards, using the In-
ternet Protocol. Between terminal and
smart card, IP packets are transferred
as APDUs. In particular, a smart
card (when inserted in a terminal) has
an IP address, and the terminal acts
as a gateway relaying all incoming IP
packets to the appropriate smart card
(it may contain more than one smart
card) [GBPR00].


Remote method invocation layer: Serialises
method parameters into bytestreams
and vice versa, and executes the call on
the remote method.


Secure method invocation layer: Provides
access control and data confidentiality
and authenticity.


In this paper we will focus on the design of 
the secure method invocation layer, and de- 
scribe it as seen from the application pro- 
grammer’s point of view. We will discuss the 
close interdependencies with the RMI layer. 
The SMI layer only requires of the underlying
layers that they deliver messages at least to the
intended recipient.


3.1 Main components 


The Secure Method Invocation (SMI) layer 
allows a caller object to securely call a 
method implemented by a callee object. Both 
caller and callee are assumed to be stored 
and run in a protected environment (a sand- 
box) that disables access to all objects and 
data within the sandbox except through pub- 
lished interfaces. 


The JASON SMI layer provides the following 
services: 


• identification and authentication of
caller and callee,


• role based access control at the method
level, and


• confidentiality and authenticity of
method parameters and results.


Future versions will add further services,
such as:


• logging


• transaction support


• non-repudiation


To call a method of an object, the caller first
has to connect to the callee in a particu-
lar role. This establishes a security context
between caller and callee that contains,
among other things, the session keys used to pro-
tect the communication. Once connected,
the caller can call all methods declared by
the object that are accessible to this role. For JASON,
roles are equivalent to keys. In other words,
ownership of the key associated with
a role proves that an object can connect in
that role.


To establish a connection, the caller needs 
a stub corresponding to the object to con- 
nect to. Similarly, the callee needs a skel- 
eton that receives incoming connections, per- 
forms access control decisions and protects 
the method parameters and results. The role 
keys used to authenticate the caller to the 
callee are stored in a separate keystore object
belonging to the same sandbox. This design 
is sketched in Fig. 1. 


The stub and skeleton necessary to se- 
curely call the methods of an object are gen- 
erated automatically from a so called JASON 
definition file. This file specifies the security 
requirements for the callee object. The con- 
tents and structure of this file are described 
next. Note that the issue of key management 
falls beyond the scope of this paper. We are 
currently investigating the proper tools to 
support key management within the JASON 
framework. As far as the JASON SMI platform 
is concerned, the keystore contains valid and 
proper keys. 


3.2 The JASON definition file 


The JASON SMI system has a strict separ- 
ation between the card application and its 
security. An application developer has two 
tasks. 


• Write a card object without bothering
about security or APDU exchange, in-
stead focusing on the information pro-
cessing logic of the application.


• Write a JASON definition file describing
the security requirements.


Therefore, the security requirements for an 
object are written in a separate JASON defini- 
tion file that resembles the syntax of a JAVA 
interface description.


Figure 1: Caller and callee components. (Diagram not reproduced; surviving labels: callee, dispatcher.)


package com.ebank;

public interface Purse
{
    roles BANK, MERCHANT, OWNER;

    accessible to ALL
    authentic short getBalance();

    accessible to BANK
    authentic short increaseBalance(confidential authentic short amount);

    accessible to MERCHANT
    authentic short decreaseBalance(authentic short amount);
}


Figure 2: JASON definition file for a simple purse.


A sample JASON definition file appears in 
Fig. 2 (describing the interface of a simple 
electronic purse application, that will be 
studied further in Sect. 5). The JASON pre-
compiler processes the definition file and
generates three files.


• A plain JAVA interface file. All keywords
not known in JAVA are removed. This is
the interface implemented by both the
implementation of the callee object and
the client stub.


• A client/caller stub, whose methods are
called to execute the corresponding re-
mote methods, and that performs au-
thentication and marshalling (including
protection) of data.


• A callee skeleton performing access con-
trol decisions, unmarshalling of para-
meters (verifying signatures and de-
crypting parameters where necessary)
for incoming invocation requests, and
executing the actual method.
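

The first of these outputs can be illustrated with a minimal sketch. It assumes the keyword removal is a purely textual step; the class name PrecompilerSketch and the method toPlainJava are hypothetical and are not part of the JASON pre-compiler itself.

```java
final class PrecompilerSketch {

    /**
     * Removes the JASON-only keywords from a definition file so that a
     * plain JAVA interface remains (illustrative only; the real
     * pre-compiler may work differently).
     */
    static String toPlainJava(String definition) {
        StringBuilder out = new StringBuilder();
        for (String line : definition.split("\n")) {
            String trimmed = line.trim();
            // Role declarations and access lists have no JAVA counterpart.
            if (trimmed.startsWith("roles ") || trimmed.startsWith("accessible to ")) {
                continue;
            }
            // Drop the protection keywords in front of types.
            out.append(line.replaceAll("\\b(confidential|authentic)\\s+", ""))
               .append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String definition = String.join("\n",
            "public interface Purse {",
            "    roles BANK, MERCHANT, OWNER;",
            "    accessible to BANK",
            "    authentic short increaseBalance(confidential authentic short amount);",
            "}");
        System.out.print(toPlainJava(definition));
    }
}
```

Applied to the increaseBalance declaration of Fig. 2, the sketch leaves only the plain signature short increaseBalance(short amount).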


In JAVA the keywords private,
protected and public are used to limit
access to methods and fields to certain
classes. An object can only access its own
private members, protected members of its
superclasses or of classes in the same package,
and all public members. These keywords
work fine if used inside a single virtual
machine. However, in a distributed
system a more fine-grained solution is
necessary.


In the JASON SMI system, access control 
is role based. Moreover, the communication 
between caller and callee has to be protected 
as well. To specify these requirements, the 
JAVA interface description is extended with 
the following keywords. 


• roles ⟨role-list⟩, listing the different
roles in which a caller can connect to
this object. The roles in this list corres-
pond to keys stored in the keystore.


• accessible to ⟨role-list⟩, specifying
which roles can call the indicated
method.


• confidential and/or authentic, spe-
cifying that a parameter or a method
result should be confidential and/or au-
thenticated.


Here a ⟨role⟩ is an identifier (usually in all
caps because it is a constant), and a
⟨role-list⟩ is a comma-separated list of roles. Let
us discuss the last three keywords in a little
more detail.


accessible to ⟨role-list⟩: Access to
a method can be limited by using the
accessible keyword. Access is only
granted if the caller can be identified (using
the corresponding keys in the keystore) as
a role in ⟨role-list⟩. The predefined role ALL
indicates that access is allowed for all roles
defined for this object (through the roles
keyword). The predefined role ANYBODY spe-
cifies a role that can be assumed by anybody
(i.e., a role whose identity is not verified). For
security reasons only methods are accessible
from off-the-card applications. Variables
should be accessed through corresponding
set and get methods.
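

The access decision described above can be sketched as follows for the purse of Fig. 2. This is a minimal illustration, assuming the generated skeleton keeps one role set per method; the class AccessSketch and the method mayCall are hypothetical names, not JASON API.

```java
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

final class AccessSketch {

    enum Role { BANK, MERCHANT, OWNER, ANYBODY }

    // Roles declared through the roles keyword in Fig. 2.
    static final Set<Role> DECLARED = EnumSet.of(Role.BANK, Role.MERCHANT, Role.OWNER);

    // One access list per method; the predefined role ALL expands to
    // every declared role, so it does not include ANYBODY.
    static final Map<String, Set<Role>> ACCESS = Map.of(
        "getBalance", DECLARED,
        "increaseBalance", EnumSet.of(Role.BANK),
        "decreaseBalance", EnumSet.of(Role.MERCHANT));

    /** True if a caller connected in the given role may call the method. */
    static boolean mayCall(Role connectedAs, String method) {
        Set<Role> allowed = ACCESS.get(method);
        return allowed != null && allowed.contains(connectedAs);
    }

    public static void main(String[] args) {
        System.out.println(mayCall(Role.MERCHANT, "decreaseBalance"));
        System.out.println(mayCall(Role.MERCHANT, "increaseBalance"));
    }
}
```

In this sketch the role is taken as already authenticated at connection time, so the check itself reduces to a set lookup.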


confidential: Parameters and return val-
ues can be specified as confidential, mean-
ing that the data involved should be sent en-
crypted between caller and callee. This guar-
antees that nobody else can eavesdrop on the
value. In the negotiation phase (see below)
a (symmetric) session key is exchanged and
an encryption algorithm chosen.


authentic: Parameters and return values
can also be specified as authentic. This
gives the following guarantees.


authenticity: Only the caller can construct
valid parameters⁵, and only the callee
can construct valid responses. The para-
meter received by the callee was sent by
the caller, and the result received by the
caller was sent by the callee. In par-
ticular, this gives the caller the guar-
antee that the intended side effects of
the method call did in fact occur at the
callee (like decreasing the balance of a
purse).


integrity: The parameter (or the result) re-
ceived was not altered while in transit.


freshness: The parameter received was
passed by the caller for the current call
of the method (and not for any previous
call). The result received was sent by the
callee for the current call of the method
(giving the guarantee that the method
was actually executed at this time, see
above).


In practice this means that the data involved 
should be signed, and that a form of replay 
protection is added as well. 


3.3 Using SMI


To call a method using the SMI framework, 
the caller has to perform the following two 
steps (see also Fig. 3 for an example connect- 
ing to the purse object whose interface was 
given previously). 


• The first step is to connect to the callee
and to establish a security context. The
caller passes the name and location of
the desired service, the desired role in
which to connect, and a reference to the
keystore to SMINaming.connect(). When
successful, this returns a reference to
the required stub.


• Subsequently, the methods of the re-
mote object can be called securely as if
they were local methods of the stub re-
turned by the previous step.


try {
    Purse purse = (Purse) SMINaming.connect("smi://smartcard/Purse",
                                            Purse.MERCHANT, purseKeyStore);
    try {
        // The amount is a short, so the literal needs an explicit cast.
        purse.decreaseBalance((short) 10);
        System.out.println("You have paid");
    }
    catch (UserException ue) {
        System.out.println("Transaction failed. You have not paid.");
    }
}
catch (RemoteException re) {
    System.out.println("Failed to connect to service.");
}


Figure 3: Caller connecting to a callee.


⁵ Strictly speaking, because a symmetric session key
is used to protect the data, the callee can also construct
valid parameters. Therefore non-repudiation cannot be
guaranteed.


If a connection is established, the stub 
also contains the current security context 
for that connection. Among other things, 
this security context contains a session key 
used to secure subsequent method invo- 
cations. Also, it contains further identi- 
fication information on the callee object. 
This identity can be retrieved by the stub’s 
getSessionIdentifier() method.


Note that even for a single call to a method,
a connection has to be set up. This may 
be wasteful for certain applications where 
transaction speed is very important (e.g., 
public transport). We are investigating the 
possibility of calling a single method without 
connecting to the object first (in fact merging 
the connection and the calling into one step). 


4 Architecture 


In this section we describe how the JASON 
SMI platform is actually implemented, and 
how the security requirements are actually 
met using several cryptographic protocols. 
In particular we show how a secure connec-
tion is set up, how the ownership of roles is
verified, and how the security context is es- 
tablished. Secondly, we show how a method 
is called securely using the information and 
session keys in the current security context. 
But first we will discuss the keys stored in 
the keystore in a little more detail. 


4.1 On keys


The keys in the keystore correspond one- 
to-one to the roles declared in the JASON 
definition file. The keystore also contains 
keys for key-management. This is discussed 
in a forthcoming paper. 


JASON supports the use of different types 
of keys in the keystore, depending on the se- 
curity requirements of the application (or in- 
deed individual objects on particular smart 
cards). Currently, the following types of keys 
are supported. 


• RSA, with 512, 1024 and 2048 bit keys.
• DES and 3DES.
• AES, with 128, 192 and 256 bit keys.


Moreover, JASON supports diversified
keys [AB96], where the key $k_i$ stored by
callee $i$ (used by the callee to authenticate
the caller or vice versa) is derived from the
master key $k_M$ stored by the caller. The key
is derived using the formula


$$k_i = \{i\}_{k_M},$$


where $\{m\}_k$ denotes encryption of message
$m$ using key $k$ (where the encryption method
is defined by the type of the key). Note that
in this case $k_i$ performs the role of a public
key (from which the corresponding private
key cannot be derived), but with the additional
property that it proves to the caller the iden-
tity $i$ of the callee.
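

As a minimal sketch of this derivation, assume an AES master key and an identity that fits into a single 16-byte block (zero-padded). The class DiversifiedKeySketch, the method derive and the identity strings are hypothetical; the source only states that the encryption method follows from the key type.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

final class DiversifiedKeySketch {

    private static final int BLOCK_BYTES = 16;

    /**
     * Computes the diversified key as the encryption of the callee
     * identity under the master key. The master key must be a valid
     * AES key (16, 24 or 32 bytes).
     */
    static byte[] derive(byte[] masterKey, String calleeId) {
        byte[] id = calleeId.getBytes(StandardCharsets.UTF_8);
        if (id.length > BLOCK_BYTES) {
            throw new IllegalArgumentException("identity must fit into one block");
        }
        try {
            // Zero-pad the identity to exactly one AES block.
            byte[] block = Arrays.copyOf(id, BLOCK_BYTES);
            Cipher aes = Cipher.getInstance("AES/ECB/NoPadding");
            aes.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(masterKey, "AES"));
            return aes.doFinal(block);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] masterKey = new byte[16]; // all-zero demo key, never use in practice
        System.out.println(derive(masterKey, "card-0001").length);
    }
}
```

The caller, who holds the master key, can recompute the key of any callee from the identity that callee claims, so it does not need to store one key per card.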


Depending on the type of key stored in the 
keystore, the appropriate authentication pro- 
tocol is run. Note that the caller keystore 
contains the keys necessary to prove its role 
(e.g., private keys), while the callee keystore 
contains the keys necessary to verify a role 
(e.g., public keys). If an entry in the caller 
keystore is null or invalid, the caller cannot
assume the corresponding role. If an entry
in the callee keystore is null or invalid, the
role cannot be verified and all connections 
for that role will be refused. 


Finally, the keystore contains, for each role 
key, information about the type of cipher 
that should be used to protect the session 
once the caller has been authenticated and 
accepted. 


4.2 Connecting to an object 


Connecting to an object exchanges and 
verifies the identity and role of the caller and 
the callee. Furthermore, a security context is 
established (containing a shared secret key) 
that is used to protect all calls to methods of 
the object. To connect to an object and estab- 
lish a session the following steps are taken 
(assuming RSA style authentication). 


• The caller sends a message containing

  - the role (as an index in the key-
    store) as which it wants to connect,

  - the type of key it will use to au-
    thenticate the role (RSA in this ex-
    ample),

  - a list of all ciphers it will accept to
    protect the session, and

  - a nonce.


• The callee looks up the role and the type
of keys it can accept. If it can accept
the suggested authentication method, it
selects one of the ciphers from the list it
received (provided it supports it) to protect
the session. It then sends a message containing

  - the selected cipher to protect the
    session,

  - a random master secret encrypted
    with the public RSA key found for
    the role in the keystore, and

  - a nonce.


• The caller validates the proposed cipher
and decrypts the master secret with its
private key in the keystore.


• Both caller and callee generate the ses-
sion key (using hashes) from the master
secret and both the caller and the callee
nonces.


• Caller and callee exchange further
identifying information encrypted and
MAC-ed using the session key, and re-
cord that in the security context.


Both caller and callee record the session key
in the security context for this connection.
Subsequent method invocations are secured
using this session key. Note that if a
connection is established as
ANYBODY, no verification of that role can be
performed. In that case, the master secret
must be exchanged using a Diffie-Hellman
type key exchange.
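

The RSA-style key establishment above can be sketched as follows. The sketch covers only the transport of the master secret and the derivation of the session key. The OAEP padding, the SHA-256 hash, the 128-bit key length and the order in which the inputs are hashed are all assumptions, and ConnectSketch and its methods are hypothetical names. Fixed-length nonces are assumed so that the hash input is unambiguous.

```java
import java.security.GeneralSecurityException;
import java.security.Key;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.MessageDigest;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.util.Arrays;
import javax.crypto.Cipher;

final class ConnectSketch {

    private static final String RSA = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";

    /** Role key pair: private half for the caller keystore, public half for the callee. */
    static KeyPair newRoleKey() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Callee: encrypt the random master secret with the public key of the role. */
    static byte[] wrap(PublicKey rolePublicKey, byte[] masterSecret) {
        return rsa(Cipher.ENCRYPT_MODE, rolePublicKey, masterSecret);
    }

    /** Caller: recover the master secret, which proves ownership of the role key. */
    static byte[] unwrap(PrivateKey rolePrivateKey, byte[] wrapped) {
        return rsa(Cipher.DECRYPT_MODE, rolePrivateKey, wrapped);
    }

    /** Both sides: hash the master secret together with both nonces. */
    static byte[] sessionKey(byte[] masterSecret, byte[] callerNonce, byte[] calleeNonce) {
        try {
            MessageDigest sha = MessageDigest.getInstance("SHA-256");
            sha.update(masterSecret);
            sha.update(callerNonce);
            sha.update(calleeNonce);
            return Arrays.copyOf(sha.digest(), 16); // 128-bit session key
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    private static byte[] rsa(int mode, Key key, byte[] input) {
        try {
            Cipher cipher = Cipher.getInstance(RSA);
            cipher.init(mode, key);
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because both nonces enter the hash, neither party can force the reuse of an old session key, even if the same master secret were to be chosen twice.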


The session context also contains two 
counters, one to count the number of mes- 
sages sent in this session, and one to count 
the number of messages received. Both are 
reset to 0 at the start of a session, and incre-
mented for each message sent or received.
These numbers are used to protect against
replay, as explained below.


4.3 Method invocation


Informally speaking, after session setup 
the stub and the skeleton are connected by 
a (secure) byte stream. The byte stream 
is routed by the communications layer to 
the correct skeleton. In fact, when a stub’s 
method is invoked, it does the following: 


• reconnect to the remote JVM containing
the remote object,


• marshal (write and transmit) the para-
meters to the remote JVM,


• wait for the result of the method invoc-
ation,


• unmarshal (read) the return value or ex-
ception returned, and


• return the value to the caller.


The stub hides the serialisation of paramet- 
ers and the network-level communication in 
order to present a simple invocation mech- 
anism to the caller. 


In the remote JVM, each remote object has 
a corresponding skeleton. The skeleton is re- 
sponsible for dispatching the call to the ac- 
tual remote object implementation. When a 
skeleton receives an incoming method invoc- 
ation it does the following: 


• unmarshal (read) the parameters for the
remote method,


• invoke the method on the actual remote
object implementation, and


• marshal (write and transmit) the result
(return value) to the caller.
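

The division of work between stub and skeleton can be sketched without any protection as follows. The method indices (0, 1, 2), the two-byte big-endian encoding and the names DispatchSketch, marshal, dispatch and InMemoryPurse are assumptions made for illustration, and the nested Purse interface merely mirrors Fig. 2.

```java
import java.nio.ByteBuffer;

final class DispatchSketch {

    interface Purse {
        short getBalance();
        short increaseBalance(short amount);
        short decreaseBalance(short amount);
    }

    /** Trivial callee used to exercise the sketch. */
    static final class InMemoryPurse implements Purse {
        private short balance;
        InMemoryPurse(short initial) { balance = initial; }
        public short getBalance() { return balance; }
        public short increaseBalance(short amount) { balance += amount; return amount; }
        public short decreaseBalance(short amount) { balance -= amount; return amount; }
    }

    /** Stub side: method index followed by the parameters, two bytes each. */
    static byte[] marshal(byte methodIndex, short... params) {
        ByteBuffer out = ByteBuffer.allocate(1 + Short.BYTES * params.length);
        out.put(methodIndex);
        for (short p : params) {
            out.putShort(p);
        }
        return out.array();
    }

    /** Skeleton side: unmarshal the parameters, invoke the method, marshal the result. */
    static byte[] dispatch(Purse target, byte[] request) {
        ByteBuffer in = ByteBuffer.wrap(request);
        short result;
        switch (in.get()) {
            case 0: result = target.getBalance(); break;
            case 1: result = target.increaseBalance(in.getShort()); break;
            case 2: result = target.decreaseBalance(in.getShort()); break;
            default: throw new IllegalArgumentException("unknown method index");
        }
        return ByteBuffer.allocate(Short.BYTES).putShort(result).array();
    }
}
```

In the SMI layer the same two steps would additionally encrypt and authenticate the bytes, as described next.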


The byte stream sent from stub to skeleton 
contains the following elements. 


• The name (or rather the index) of the
method to call, together with a MAC
computed using the session key and
the current value of the sent messages
counter. Even if RMI is used as the trans-
port mechanism, this information is ne-
cessary to prevent remote method in-
vocations being redirected to the wrong
method.


• Each confidential parameter is encryp-
ted.


• For each authentic parameter, a MAC
computed using the session key and
the current value of the sent messages
counter is appended to the parameter.


For efficiency reasons parameters are 
shuffled so that the confidential and 
authentic parameters are placed in con- 
tiguous blocks within the byte stream (see 
Fig. 4). All confidential parameters are en- 
crypted as a single block. Similarly, the MAC 
for all authentic parameters is computed in 
a single block, appending the sent messages 
counter only once. 
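

A minimal sketch of the MAC computation on the request follows. It assumes a single tag that covers the method index and the contiguous block of authentic parameters, with the counter appended once. HMAC-SHA-256 stands in for the MAC, which the text does not fix, and RequestMacSketch and its methods are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

final class RequestMacSketch {

    /** Stub side: MAC over method index, authentic block and sent-messages counter. */
    static byte[] tag(byte[] sessionKey, long sentCounter, byte methodIndex,
                      byte[] authenticBlock) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(sessionKey, "HmacSHA256"));
            mac.update(methodIndex);
            mac.update(authenticBlock);
            // The counter is appended only once, after all authentic parameters.
            mac.update(ByteBuffer.allocate(Long.BYTES).putLong(sentCounter).array());
            return mac.doFinal();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Skeleton side: recompute the tag with its own received-messages counter. */
    static boolean verify(byte[] sessionKey, long receivedCounter, byte methodIndex,
                          byte[] authenticBlock, byte[] receivedTag) {
        byte[] expected = tag(sessionKey, receivedCounter, methodIndex, authenticBlock);
        return MessageDigest.isEqual(expected, receivedTag);
    }
}
```

Because the counter is never taken from the message itself in this sketch, a replayed request fails verification as soon as the callee's counter has advanced, and a request redirected to another method index fails for the same reason.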


The return stream from skeleton to stub to 
communicate results has the following struc- 
ture. 


• If the return type is confidential, the re-
turn value is encrypted with the session
key.


• If the return type is authentic, the sent
messages count is appended to the byte
stream, and both the count and the
value are used to compute a MAC with
the session key. The result is appended
to the byte stream.
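

For an authentic (but not confidential) short result, the return stream can be sketched as value, count, MAC. The field widths, the use of HMAC-SHA-256 and the names ReturnStreamSketch, encode and decode are assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Arrays;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

final class ReturnStreamSketch {

    private static final int BODY_BYTES = Short.BYTES + Long.BYTES;
    private static final int TAG_BYTES = 32;

    /** Skeleton side: value, then sent-messages count, then MAC over both. */
    static byte[] encode(byte[] sessionKey, long sentCount, short value) {
        byte[] body = ByteBuffer.allocate(BODY_BYTES)
            .putShort(value).putLong(sentCount).array();
        byte[] tag = hmac(sessionKey, body);
        return ByteBuffer.allocate(BODY_BYTES + TAG_BYTES).put(body).put(tag).array();
    }

    /** Stub side: returns the value, or throws if the MAC or the count is wrong. */
    static short decode(byte[] sessionKey, long expectedCount, byte[] stream) {
        if (stream.length != BODY_BYTES + TAG_BYTES) {
            throw new SecurityException("unexpected stream length");
        }
        byte[] body = Arrays.copyOfRange(stream, 0, BODY_BYTES);
        byte[] tag = Arrays.copyOfRange(stream, BODY_BYTES, stream.length);
        if (!MessageDigest.isEqual(hmac(sessionKey, body), tag)) {
            throw new SecurityException("MAC verification failed");
        }
        ByteBuffer in = ByteBuffer.wrap(body);
        short value = in.getShort();
        if (in.getLong() != expectedCount) {
            throw new SecurityException("replayed or out-of-order result");
        }
        return value;
    }

    private static byte[] hmac(byte[] key, byte[] data) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The stub compares the transmitted count with its own received-messages counter, which is what gives the caller the freshness guarantee for the result.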


4.4 Inter object communication 


Because the caller and callee are physic-
ally separated by a network, the call to a re-
mote method must be transferred to the re-
mote object over the network before it can be
executed there. The most natural approach
would be to use Java’s Remote Method In-
vocation mechanism to achieve this. At the
caller side, the SMI stub first converts the
parameters to a protected bytestream, as ex-
plained in Sect. 4.3. The RMI layer then
transmits this bytestream to the callee, and
invokes the corresponding method of the
callee SMI skeleton. There, the access per-
missions are checked and the bytestream is
unpacked before the original callee method
is invoked.


Figure 4: Byte stream structure from caller to callee. (Diagram not reproduced; surviving labels: method, authentic parameters, confidential parameters. Legend: MAC over indicated bytes and message count using k; bytes encrypted using k.)


However, this scenario is complicated by 
the fact that JavaCard (as of version 2.2) 
uses a different RMI system, if only because 
a JavaCard is not connected to a network 
directly, but instead communicates with the 
outside world through a terminal using an 
APDU stream. This would imply that the ter- 
minal has to convert an incoming RMI re- 
quest to a JavaCard specific JC-RMI request 
(and similarly for the responses). This does 
not appear to be straightforward, because 
the RMI wire protocols are different. The 
only option is to create - for each skeleton 
on the callee smart card - a separate skel- 
eton (and stub) for the terminal, that receives 
the incoming RMI request and simply calls 
the remote method on the smart card using 
JC-RMI. This means the terminal potentially 
needs access to a huge number of skeletons 
and stubs, simply to pass bytestreams ver- 
batim! 


Moreover, we note that RMI’s support for 
marshalling and unmarshalling of method 
parameters and results becomes totally su-
perfluous in this approach, because the SMI 
layer already converts the parameters to a 
bytestream in the first place. 


To solve the first problem, RMI and JC-
RMI need to be brought more in line, such
that their wire protocols become sufficiently
compatible to allow translations between
them using a generic translation mechan-
ism running in the terminal. To solve the
second problem, the RMI system should
provide versatile hooks to allow the outgo-
ing bytestream to be protected in the fine-
grained manner required by JASON. Alternatively, SMI
should be incorporated into the RMI layer.


5 Example 


Fig. 2 in Sect. 3.2 shows the security
requirements of a simple purse application.
It corresponds to the actual implementation
given in Fig. 5. Clearly the implementation is
quite straightforward. Also, the strictness of 
the separation between implementation and 
its security is apparent. The implementation 
does not contain a single line of code con- 
cerning security. All the security is contained 
in the generated stub and skeleton. The skel- 
eton calls the implementation and adds se- 
curity to it. Note that each method is defined 
with the default JAVA visibility, to allow the 
skeleton to access them, but not giving ac- 
cess to subclasses outside the package. 


package com.ebank;

import javacard.framework.UserException;

class PurseImpl implements Purse {

    public static final byte OVERFLOW = (byte) 1;
    public static final byte UNDERFLOW = (byte) 2;

    private static final short MAX = (short) 500;

    private short balance = 0;

    short getBalance() {
        return balance;
    }

    // Adds amount to the balance; signals OVERFLOW if the new balance
    // would reach MAX.
    short increaseBalance(short amount) throws UserException {
        if (balance + amount >= MAX) {
            UserException.throwIt(OVERFLOW);
        }
        balance += amount;
        return amount;
    }

    // Subtracts amount from the balance; signals UNDERFLOW if the new
    // balance would not remain positive.
    short decreaseBalance(short amount) throws UserException {
        if (balance - amount <= 0) {
            UserException.throwIt(UNDERFLOW);
        }
        balance -= amount;
        return amount;
    }
}


Figure 5: Implementation of a simple purse.


6 Conclusions & Further Research


We are currently implementing the
JASON SMI framework on a JavaCard
2.2 platform. The final implementa-
tion will be available under the GNU
General Public License (GPL) through
http://www.cs.kun.nl/~jhh/jason.html
within a few months.


We intend to extend JASON’s SMI function- 
ality with logging and auditing functions, as 
well as transaction (and rollback) support. 
Related to the logging and auditing issue
is the fact that the current implementation
does not provide non-repudiation. The rami-
fications of implementing non-repudiation
are the subject of further investigation.
Also, one could argue that the authentic
keyword is overloaded (in the sense that it
gives too many guarantees, especially fresh-
ness, at the cost of a more complex and
resource-consuming protection mechanism).
Using JASON to develop several real-world
smart card applications will tell whether a
more fine-grained set of security specifica-
tion keywords is required.


Finally, to make the JASON vision of a 
smart card application consisting of millions 
of distributed objects a reality, object broker 
functionality has to be added that is con- 
sistent with the high security requirements 
of typical smart card applications, and the 
highly dynamic nature of the smart card
network.


Abstract. We present malicious
insider attacks on chip-card per-
sonalization processes and suggest 
an improved way to securely gen- 
erate secret-keys shared between 
an issuer and the user’s smart 
card. Our procedure which re- 
sults in a situation where even the 
card manufacturer producing the 
card cannot determine the value 
of the secret-keys that he per- 
sonalizes into the card, uses pub- 
lic key techniques to provide in- 
tegrity and privacy of the gen- 
erated keys with respect to the 
complete initialisation chain. Our 
solution, which provides a non- 
interactive alternative to authen- 
ticated key agreement protocols, 
achieves provable security in the 
random oracle model under stan- 
dard complexity assumptions. Our 
mechanism also features a cer- 
tain genericity and, when coupled 
to a cryptosystem with fast en- 
cryption like RSA, allows low-cost 
intrusion-secure secret key genera- 
tion. 


1 Introduction 


Tamper-resistant devices like smart-cards 
are used to store and process secret and per- 
sonal data. Examples of applications making 
extensive use of smart cards include wireless 
communication systems such as the Global 
System for Mobile communications (GSM), 


or banking systems using the EMV (Europay, 
Mastercard and VISA) standard. These ap- 
plications share the fact that they use secret 
key identification or authentication to achieve 
security and enable access to services. Thus
some unique secret key $K_i$ (we will adopt the
notation $K_i$ to denote a card’s secret key by
analogy with the widely known GSM termi-
nology) needs to be shared between the issuer
(the bank or the telecom operator) and the
smart card. Usually this secret key material is
downloaded into the card during the so-called
chip personalization phase, i.e. the initialisa-
tion phase during which identical cards are
configured in such a way that each
of them corresponds to one specific user.


Usually, the card personalization center ei- 
ther writes secret keys into the cards accord- 
ing to a list provided by the issuer, or gener- 
ates the keys itself and downloads them into 
the cards within its own premises, and sub-
sequently transmits a list of (encrypted) keys 
to the issuer. We refer to these scenarios as 
typical personalization protocols. In the se- 
quel, we consider precisely the second sce- 
nario (key generation within the manufac- 
turer’s premises) and show that such a ba- 
sic personalization procedure is vulnerable to 
malicious insiders. 


We first discuss the potential security flaws 
in such a process, and then proceed to present 
a new personalization protocol in which the 
manufacturer is able to provide evidence to 
the issuer that no one except the issuer him-
self knows the secrets stored inside the cards.
Thus our new technique provides generally
trusted keys for secret key applications. 


The rest of the paper is organized as fol- 
lows. In Section 2, we give an overview of a 
typical personalization protocol, and we point 
out its vulnerability to insider attacks when 
appropriate physical site protection measures 
are not enforced. Section 3 proposes our new 
personalization procedure. We provide a thor-
ough security analysis in Section 4 and con-
clude by giving practical implementations
of our technique in Section 5.


2 Personalization Protocols 


2.1 The current approach: typical 
protocols 


Card personalization involves three par-
ties: an issuer (telecommunications operator
or bank), a card manufacturer (who actually
personalizes smart cards for the issuer), and a
smart card. Beyond graphical personalization
(which may consist in printing the issuer’s
logo on the card, for instance), the manufac-
turer has to electrically initialize the card and,
among such tasks, initialize the files meant to
contain the card’s secret key material $K_i$. In a
typical scenario, each secret key $K_i$
is generated uniformly at random by a per-
sonalization computer (PC) connected to the
personalization system (such as a DataCard
9000 machine). Whenever a card enters the
system, a fresh random key $K_i$ is selected by
the PC and downloaded into the card’s non-
volatile memory.


Simultaneously, the key gets encrypted on
the PC, together with the card identifier
Id (which might be some publicly available
unique bitstring such as a serial number for
instance) using the issuer’s secret key K_S.
Lists of encrypted (K_i, Id) pairs are then sent
over an insecure channel to the issuer who decrypts
the received files and recovers the pairs
(K_i, Id).


Another way to proceed consists in encrypting
the generated keys with the issuer’s
authenticated public key in an asymmetric
key setting. This way, the issuer is the only
entity able to decrypt the generated files, and
the key K_S need not be known at the manufacturer’s
premises.


However, both solutions are vulnerable to 
insider attacks where a malicious entity hav- 
ing access to the manufacturer’s premises 
would get hold of the key. We may think of 
a malicious insider as some malevolent employee
willing to clone SIM cards or as a
hacker who discreetly eavesdrops on the computer
network from outside the personalization
center. This strongly motivates the
search for protocols featuring a guaranteed 
level of security against this kind of threat. 


2.2 Security Notions for Card Per- 
sonalization 


Let us examine the setting and determine
which security goals are desirable
from the issuer’s standpoint. When the personalization
protocol takes place, the parties are:


— a tamper-resistant secret-less chip-card 
to be personalized with identifier Id, 

— an issuer (supposedly remote), 

— a personalization system (PC +
DC9000) in which the issuer has no
reason to put trust.


Ultimately, the goals the personalization 
protocol is meant to achieve are the follow- 
ing. At the end of the process, 


1. the card must contain some secret key
K_i belonging to some fixed key space (we
call this property correctness),

2. the issuer must know the correct pair
(Id, K_i) (we refer to this property as key
integrity),

3. the issuer should be confident that he is
the only entity who shares the knowledge
of the (Id, K_i) pair with the card (this is
defined as key privacy).


Correctness is easily achieved. The question
is whether requirements 2 and 3 are actually
achieved by current typical personalization
protocols, and the answer is obviously
no. The above protocols meet neither key integrity
nor key privacy. Indeed, the computer,
if handled by a malicious person, may
very well generate a given K_i and transmit a
different one to the issuer. This can be considered
a denial of service attack, as the end-user
would get a non-functional card. Alternatively,
the computer might respect the integrity
property by providing the right pair
(Id, K_i) to the issuer, but reveal this pair to
an intruder getting hold of the PC. In this
case, card cloning becomes possible. We call
such attacks ‘malicious insider attacks’.


2.3 The Interlock protocol


One obvious attempt to address this prob- 
lem consists in executing a key agreement 
protocol such as Interlock [4] between the 
card and the issuer. 


Interlock is described as follows. Assuming
that two entities A and B, with public
keys pk_A and pk_B, want to exchange a secret
through an insecure channel, A and B
proceed as follows. First, A and B exchange
their public keys through the channel. Then,
A (resp. B) chooses a random r_A (resp. r_B),
and encrypts it with pk_B (resp. pk_A) to obtain
a ciphertext c_A (resp. c_B). c_A (resp. c_B)
is a bitstring which can be cut into two equal
parts (c'_A, c''_A) (resp. (c'_B, c''_B)). Thus, A sends
c'_A to B, and sends the remaining part c''_A only
after having received c'_B. Finally, B sends c''_B.
At the end of the sequence, A and B share
the pair (r_A, r_B).
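
The message ordering described above can be illustrated with a minimal Python sketch. It is not taken from [4]: textbook RSA without padding, the toy Mersenne-prime moduli, the function names and the byte-level split of the ciphertext are all assumptions chosen only to make the sequence of four messages concrete.

```python
import secrets

def toy_rsa_keypair(p, q, e=65537):
    """Textbook RSA from two given primes (no padding; illustration only)."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))  # modular inverse (Python 3.8+)
    return (n, e), (n, d)

def encrypt_to_halves(r, pk):
    """Encrypt r under pk and cut the fixed-length ciphertext in two parts."""
    n, e = pk
    size = (n.bit_length() + 7) // 8
    c = pow(r, e, n).to_bytes(size, "big")
    return c[: size // 2], c[size // 2:]

def decrypt_halves(first, second, sk):
    """Reassemble both parts and decrypt them with the private key sk."""
    n, d = sk
    return pow(int.from_bytes(first + second, "big"), d, n)

# Hypothetical toy keys built from Mersenne primes; far too small for real use.
pk_a, sk_a = toy_rsa_keypair(2**61 - 1, 2**89 - 1)
pk_b, sk_b = toy_rsa_keypair(2**107 - 1, 2**127 - 1)

r_a = secrets.randbits(64)               # A's random value
r_b = secrets.randbits(64)               # B's random value
a1, a2 = encrypt_to_halves(r_a, pk_b)    # A encrypts under B's key
b1, b2 = encrypt_to_halves(r_b, pk_a)    # B encrypts under A's key

# Order on the wire: first part from A, first part from B,
# then the second part from A, and finally the second part from B.
transcript = [("A->B", a1), ("B->A", b1), ("A->B", a2), ("B->A", b2)]

shared_at_a = (r_a, decrypt_halves(b1, b2, sk_a))
shared_at_b = (decrypt_halves(a1, a2, sk_b), r_b)
```

Both parties end up with the same pair, but only after four messages have crossed the channel, which is the interactivity the text objects to.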


Clearly, this protocol thwarts passive man- 
in-the-middle attacks. However, it is interac- 
tive, which represents an unacceptable hur- 
dle in the context of a personalization pro- 
cess. The only way to achieve an equiva- 
lent non-interactive protocol would be to use 
public-key certificates and signature verifi- 
cation which calls for far too complex (and 
heavy) public-key infrastructures. 


Besides, security requirements explicitly
demand resistance against active attacks,
where the attacker may not only eavesdrop on
exchanged pieces of information but also
modify them in some way, and may impersonate
parties as well. Because it does not
provide authentication, Interlock does not resist
active attacks.


The contribution of this paper consists in 
providing a non-interactive alternative to the 
Interlock protocol which, in our context, re- 
sists active attacks and needs no certificates 
or signatures whatsoever. 


3 A Provably Secure Card Per- 
sonalization Protocol 


Let us go back to the typical scenario. Ob- 
viously, the security breach resides in the 
possibility to attack the PC. Thus each and 
every secret should be generated inside the 
card itself, which, by assumption, provides 
the advantage of being tamper-resistant over 
an open PC. 


3.1 A First approach 


Thus a first idea is to generate the secret
key K_i inside the card, download the issuer’s
public key into the card, encrypt the generated
secret under the public key and output
the result. Next, the encrypted secrets are collected
along with the Id’s in a file and sent to
the issuer who decrypts the list with his private
key SK and recovers the associated pairs
in clear (alternatively the Id’s could also be
encrypted together with the secret K_i inside
the card). This protocol is shown in figure 1.
The public key of the issuer is noted PK; typically,
one could use stand-alone RSA public
key encryption [5] for instance. We suppose
the key pair (PK, SK) is generated once and
for all by the issuer himself, and then transmitted
to the personalization center which
uses it for a certain period of time.


Fig. 1. Secure personalization protocol: first approach


Unfortunately, this solution is vulnerable
to the well-known man-in-the-middle attack.
Suppose the attacker controls the PC again.
She is then able to generate her own public
and private RSA key pair and to fool the card
by sending it her own public key. She recovers
the encrypted K_i values, decrypts them,
and re-encrypts them with the issuer’s public
key. Thus key integrity is preserved, but
key privacy is violated. The attack is shown
in figure 2 where the attacker’s public key is
noted PK’.


3.2 Proposed Protocol 


Let us now proceed to describe our pro- 
tocol. The security analysis will be discussed 
in the next section. Basically, the personaliza- 
tion process now includes the following steps : 


1. the PC transmits the issuer’s public key
PK to the card,

2. the card generates a random r, computes
K_i = H(r, PK) where H is a hash function
such as SHA-1 [7], and memorizes
K_i in non-volatile memory,

3. the card encrypts r as c = E_PK(r) where
E_PK denotes public key encryption under
PK, and outputs c,

4. the PC collects the pair (Id, c) and sends
it to the issuer who later decrypts c using
SK, recovers r = D_SK(c) and computes
K_i = H(r, PK).


This protocol is shown in figure 3. 


4 Security Analysis 


4.1 Main Results 


Although it looks simple, our protocol
achieves a very satisfactory security property,
namely that


— both key integrity and key privacy are 
preserved under a passive attack, 

— if key privacy is not preserved under an 
active attack then key integrity cannot 
be preserved either. 


The proof of that fact is given below. From a
practical viewpoint, this means that if an intruder
simply eavesdrops on what is transmitted
through the PC, our protocol fully reaches
the security goals of section 2.2, namely key
integrity and privacy. Additionally, if the intruder
actively operates changes over transmitted
data, she is given no other choice than


— either knowing the key K_i generated by
the card; but then the issuer recovers
nothing else than a faulty key K'_i ≠ K_i.
Subsequently, the card just cannot work
properly because user authentication will
be unsuccessful each time the end user
attempts to access the issuer’s service.
The issuer may then recognize the card
as a fake or abnormal one and blacklist
it.


— or letting the card generate K_i properly
and later have normal access to the issuer’s
service; but then, no information
whatsoever can be obtained on K_i.


Fig. 2. Man-in-the-middle attack on key generation process


In other words, our protocol prevents insiders
from cloning normal cards since only
useless cards are exposed to key divulgation.
Trying to gain information on the card’s key
simply forbids its future use in normal conditions.
We guarantee this under any type of attack,
however sophisticated. The insider
is left only with malevolence, i.e. the ability
to force the personalization of useless cards.
We argue that this scenario is not of interest
to an active adversary. We assess these
results without considering collusions in the
first place, and address these further in section
4.5.


4.2 Security Proof Against Passive 
Insiders 


We state, in a somewhat more formal way: 


Theorem 1 (Passive Attacks). Assume
that the encryption scheme E_PK is deterministic
and one-way under chosen plaintext attacks
(OW-CPA). Then no polynomial time
attacker given PK and c = E_PK(r) can recover
K_i = H(r, PK) with non-negligible probability
in the random oracle model.


Proof. We assume the existence of an attacker
A with success probability ε and show
how to invert E_PK with probability ε'. We
build a reduction algorithm B as follows. B is
given an instance c = E_PK(r) and must return
r with non-negligible probability. B randomly
selects K_i and runs A(PK, c).


Now, each time A queries the random oracle
H for an input (r', pk), B checks in the
history of queries if (r', pk) was queried by A
in the past, in which case the same answer
is returned to A. Otherwise, if pk = PK and
E_PK(r') = c, then B sets r̃ = r' and returns K_i.
If none of these cases occur, B selects h uniformly
at random, returns h and updates the
history of queries. Now when A has finished,
B checks whether r̃ was initialized during the
game, simply returns r̃ if so or fails otherwise.
This completes the description of the reduction
algorithm B.
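
The bookkeeping performed by B can be sketched in a few lines of Python. This is only an illustration of the simulation described above, under stated assumptions: the class and method names are hypothetical, the deterministic encryption is textbook RSA with a toy modulus, and the answer given for the "winning" query is also stored in the history, which the proof does not spell out but which keeps answers consistent.

```python
import secrets

def toy_encrypt(pk, r):
    """Deterministic textbook RSA, used here only as a stand-in for E_PK."""
    n, e = pk
    return pow(r, e, n)

class ReductionB:
    """Simulates the random oracle H for an attacker A, as in the proof."""

    def __init__(self, pk, c, encrypt, out_len=20):
        self.pk, self.c, self.encrypt = pk, c, encrypt
        self.k_i = secrets.token_bytes(out_len)  # B's randomly selected K_i
        self.out_len = out_len
        self.history = {}                        # past queries and answers
        self.r_found = None                      # stays None until A asks for r

    def oracle(self, r, pk):
        if (r, pk) in self.history:              # repeated query: same answer
            return self.history[(r, pk)]
        if pk == self.pk and self.encrypt(pk, r) == self.c:
            self.r_found = r                     # A revealed the preimage of c
            answer = self.k_i
        else:
            answer = secrets.token_bytes(self.out_len)  # fresh uniform value
        self.history[(r, pk)] = answer
        return answer

    def finish(self):
        """Return the recovered r, or None if B fails."""
        return self.r_found

# Toy instance (assumption): small public modulus with exponent 3.
PK = (1_000_000_007 * 998_244_353, 3)
secret_r = 123_456_789
b = ReductionB(PK, toy_encrypt(PK, secret_r), toy_encrypt)
```

An attacker that can compute K_i = H(r, PK) must, except with small probability, query the oracle on (r, PK); as soon as it does, `finish()` returns r, which is exactly how B inverts the encryption.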


Since the simulation of H is perfect, it is
clear that B is sound. We denote by Ask the
event that A submits r to the simulation of
H. Now if Ask never happens, K_i is a uniformly
distributed random value unknown to
A, so

\[ \Pr[A = K_i \mid \neg\mathrm{Ask}] \le \frac{1}{\#H} \]

where #H denotes the number of elements in
the output space of H. By assumption,

\[
\begin{aligned}
\varepsilon &\le \Pr[A = K_i] \\
&\le \Pr[A = K_i \mid \neg\mathrm{Ask}] + \Pr[\mathrm{Ask}] \\
&\le \frac{1}{\#H} + \Pr[\mathrm{Ask}]
\end{aligned}
\]

which yields

\[ \varepsilon' = \Pr[B = r] = \Pr[\mathrm{Ask}] \ge \varepsilon - \frac{1}{\#H} . \]

Therefore, if ε is non-negligible, ε' is non-negligible
as well. □


Interestingly, we also get a slightly different
result for non-deterministic encryption
schemes, i.e. when the protocol relies
on a probabilistic encryption function r ↦
E_PK(r; u). We include this result here for the
sake of completeness. We state:


Theorem 2 (Passive Attacks). Assume
that the probabilistic encryption scheme E_PK
is semantically secure under chosen plaintext
attacks (IND-CPA). Then no polynomial
time attacker given PK and c = E_PK(r) can
recover K_i = H(r, PK) with non-negligible
probability in the random oracle model.


Proof. Here again, we assume the existence
of the same attacker A with non-negligible
success probability ε and show how to
distinguish encryptions E_PK with non-negligible
advantage ε'. The reduction algorithm
B = (B_1, B_2) is as follows. B_1 (the find stage)
chooses two distinct messages (r_0, r_1) uniformly
at random and outputs them. Then
B_2 inputs c_b = E_PK(r_b; u) for a certain bit b
and random tape u. B must guess b with non-negligible
advantage.

To do this, B_2 is designed as follows. B_2
randomly selects K_i, and runs A(PK, c_b).
Each time A queries the random oracle H for
an input (r, pk), B_2 checks in the history of
queries if (r, pk) was queried by A in the past,
in which case the same answer is returned to
A. Otherwise, if pk = PK and r = r_{b'} for some
b' ∈ {0, 1}, then B_2 stops and outputs b'. If none
of these cases occur, B_2 selects h uniformly
at random, returns h and updates the history
of queries. If A finishes, B_2 stops, chooses
β ∈ {0, 1} at random and returns β. This
completes the description of the reduction algorithm
B.


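The simulation of H is almost perfect. We
denote by Good the event that A submits r_b
to the simulation of H and by Bad the event
that A submits r_{1-b} to the simulation of H.
Now if neither Good nor Bad ever happens,
K_i is a uniformly distributed random value
unknown to A, so

\[ \Pr[A = K_i \mid \neg(\mathrm{Good} \vee \mathrm{Bad})] \le \frac{1}{\#H} . \]

By assumption,

\[
\begin{aligned}
\varepsilon &\le \Pr[A = K_i] \\
&\le \Pr[A = K_i \mid \neg(\mathrm{Good} \vee \mathrm{Bad})] + \Pr[\mathrm{Good} \vee \mathrm{Bad}] \\
&\le \frac{1}{\#H} + \Pr[\mathrm{Good} \vee \mathrm{Bad}] .
\end{aligned}
\]

Since the choice of (r_0, r_1) is independent
from A’s view, the probability that r_{1-b} is submitted
by A to the random oracle H is upper
bounded by 1/#r (#r being the number of
possible values of r). Given that Good and Bad
exclude each other, we get

\[
\begin{aligned}
\Pr[\mathrm{Good}] &= \Pr[\mathrm{Good} \vee \mathrm{Bad}] - \Pr[\mathrm{Bad}] \\
&\ge \Pr[\mathrm{Good} \vee \mathrm{Bad}] - \frac{1}{\#r} .
\end{aligned}
\]

Therefore

\[
\begin{aligned}
\frac{1 + \varepsilon'}{2} &= \Pr[B = b] \\
&= \Pr[\mathrm{Good}] + \Pr[\neg(\mathrm{Good} \vee \mathrm{Bad}) \wedge \beta = b] \\
&= \Pr[\mathrm{Good}] + \frac{1}{2}\Pr[\neg(\mathrm{Good} \vee \mathrm{Bad})] \\
&= \frac{1}{2} + \Pr[\mathrm{Good}] - \frac{1}{2}\Pr[\mathrm{Good} \vee \mathrm{Bad}] \\
&\ge \frac{1}{2} + \frac{1}{2}\Pr[\mathrm{Good} \vee \mathrm{Bad}] - \frac{1}{\#r} \\
&\ge \frac{1}{2} + \frac{\varepsilon}{2} - \frac{1}{2\,\#H} - \frac{1}{\#r}
\end{aligned}
\]

and finally ε' ≥ ε − 1/#H − 2/#r as wanted.
□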


4.3 Security Proof Against Active 
Insiders 


We now focus on security against active in- 
siders. We have: 


Theorem 3 (Active Attacks). Assume
the encryption scheme E_PK is deterministic
and one-way or probabilistic and semantically
secure (under chosen ciphertext attacks).
Then obtaining information about K_i
requires the attacker to corrupt the value of
PK. Then K_i ≠ H(r, PK) with overwhelming
probability.


Proof. Essentially, we follow the initial work
of [6]. Suppose indeed, that the attacker does
not alter the value of PK which is transmitted
to the card. Two situations may occur:


1. either the insider corrupts the value of
c = E_PK(r) by changing it into c', but
this is of no use whatsoever to her,

2. or she does not corrupt c; in this case,
the insider is passive and theorem 1 or
2 applies, depending on E_PK. This means
that no information about K_i can be obtained.


On the other hand, if the insider controls
the PC and cheats on PK, she may recover
K_i by submitting another public key PK’
but the issuer then gets a different value
H(r, PK) ≠ H(r, PK’) with overwhelming
probability. Thus the card will not be functional
and no damage (other than denial of
service) will be incurred by the issuer. This provides
evidence that either the protocol is correct, or
the card will not function at all. □
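
The second half of the argument can be replayed numerically. The following sketch is an illustration under stated assumptions, not the paper's implementation: it uses textbook RSA with exponent 65537, toy moduli built from Mersenne primes, SHA-1 over a fixed-length encoding of (r, n), and hypothetical variable names. The insider substitutes her own modulus, learns the card's key, and the issuer ends up with a different one.

```python
import hashlib
import secrets

E = 65537  # public exponent shared by both toy key pairs (assumption)

def keypair(p, q):
    """Return (n, d) for textbook RSA built from the primes p and q."""
    n = p * q
    return n, pow(E, -1, (p - 1) * (q - 1))

def H(r, n):
    """SHA-1 over a fixed-length big-endian encoding of r and the public key n."""
    return hashlib.sha1(r.to_bytes(32, "big") + n.to_bytes(32, "big")).digest()

# Toy keys, far too small for real use.
n_issuer, d_issuer = keypair(2**107 - 1, 2**127 - 1)
n_attacker, d_attacker = keypair(2**61 - 1, 2**89 - 1)

# The card honestly runs the protocol, but was handed the attacker's key PK'.
r = secrets.randbelow(n_attacker - 2) + 2
k_card = H(r, n_attacker)
c_prime = pow(r, E, n_attacker)

# The insider decrypts, learns the card's key, and re-encrypts for the issuer.
r_seen = pow(c_prime, d_attacker, n_attacker)
k_attacker = H(r_seen, n_attacker)
c = pow(r_seen, E, n_issuer)

# The issuer decrypts correctly but hashes r together with the genuine PK.
k_issuer = H(pow(c, d_issuer, n_issuer), n_issuer)
```

Here `k_attacker` equals `k_card`, so privacy is lost, but `k_issuer` differs from both, so the card cannot authenticate to the issuer: the attack is visible as a non-functional card.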


4.4 Can Denial of Service Be
Avoided?


What is desirable is that the protocol
would preserve both key integrity and privacy
under any attack circumstances, as this
would thwart the denial of service attacks discussed
above. For theoretical reasons, however,
no protocol can achieve such a higher
security level without an authenticated communication
channel between the card and the
issuer. The only cheap way to achieve authentication
would consist in masking the issuer’s
public key PK into the read only memory
(ROM) of the card. We would then reach
both key integrity and privacy in any case.
But we recall that denial of service does not
serve the attacker’s interests anyway because
it precisely testifies to the presence of an active
attack during the personalization process.


4.5 Collusion attacks 


An intuitive way to break the system would 
be to envision the collusion between a mali- 
cious insider and a malicious issuer. For ex- 
ample, the insider might substitute the gen- 
uine issuer’s public key with the malicious is- 
suer’s public key. In this case, under the un- 
usual assumption that both issuers use the 
same operating system on the card, the per- 
sonalized cards would work on the malicious 
issuer’s network whereas they would not work 
on the genuine network. Although this sce- 
nario theoretically exists, one cannot help 
wondering what benefit the malicious issuer 
could possibly get out of this setting. First, 
the cards are shipped to the initially intended 
recipients or more generally speaking directly 
to the user. Thus the malicious issuer will 
never get hold of the cards. Second, this is- 
suer would then have cards in the field that 
can and will be used on his own network,
but he could not plausibly recover any fees
associated with this usage. So the users would
simply (say) use wireless communication net- 
works without paying a dime to the malicious 
operator. 


Interestingly, we could also envision attacks
combining an active intrusion with a partial
or total access to the issuer’s decryption
server. This would allow the attacker to
query the server for r-values of her choice
given c, possibly excepting the ones that correspond
to already listed K_i’s (as this could
cause some kind of collision detection by the
server). This is exactly a chosen-ciphertext
attack scenario and in this case, again, our
protocol remains fully secure in the same
sense, provided that the underlying encryption
scheme E_PK be OW-CCA or IND-CCA
(instead of OW-CPA or IND-CPA). This is
easily obtained as a natural extension of theorems
1 and 2. Then chosen-ciphertext secure
encryption schemes like RSA-OAEP [1]
or Cramer-Shoup [2] must be employed.


A denial of service attacker can always in- 
teract with the chip-card in such a way that 
in the end the card is invalid. But, as stressed 
before, we assume that this scenario is not of 
interest to an active adversary. We also stress 
the fact that more elaborate attacks where 
the complete set of employees of the manufac- 
turer collude against the issuer are not con- 
sidered in this paper. As an illustration, these 
include situations where the card’s operating 
system itself is flawed or corrupted and does 
not fully respect the protocol. 


In light of the above discussion, we believe
that no other protocol can further enhance
the one we propose in this setting, except if
additional key authentication is implemented
in some way or another.


5 Practical Examples


5.1 An Example Using Low-
Exponent RSA


We recall the protocol steps in this context,
taking SHA-1 as an embodiment of H.
First, the issuer generates an RSA key pair
(PK, SK) where PK = (n, e) and SK = (n, d)
with n = pq for two large primes p and q,
e = 3 for instance and d = e^{-1} mod (p −
1)(q − 1) (RSA key generation imposes that
gcd((p − 1)(q − 1), e) = 1). The manufacturer
is given n and for each card to be personalized,
engages the PC in the following protocol:


1. the PC transmits n to the card with identifier
Id,

2. the card selects r uniformly at random
and computes K_i = SHA-1(r, n),

3. the card computes c = r^3 mod n and
outputs c,

4. the PC collects the pair (Id, c) and sends
it later to the issuer,

5. the issuer recovers r = c^d mod n, computes
K_i = SHA-1(r, n) and stores the
pair (Id, K_i).
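
The five steps can be run end to end with a minimal Python sketch. It is a toy illustration only: the two small primes, the length-prefixed encoding of (r, n) fed to SHA-1, and the function names are assumptions, and the modulus is far too small to offer any security.

```python
import hashlib
import secrets

# Toy parameters (assumption): two small primes p, q with gcd(3, (p-1)(q-1)) = 1.
P, Q, E = 1_000_000_007, 998_244_353, 3
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))   # issuer's private exponent (Python 3.8+)

def derive_key(r, n):
    """K_i = SHA-1(r, n); each integer is length-prefixed to avoid ambiguity."""
    h = hashlib.sha1()
    for value in (r, n):
        raw = value.to_bytes((value.bit_length() + 7) // 8 or 1, "big")
        h.update(len(raw).to_bytes(2, "big") + raw)
    return h.digest()

def card_personalize(n):
    """Steps 2 and 3: performed inside the card; only c leaves the card."""
    r = secrets.randbelow(n - 2) + 2
    k_i = derive_key(r, n)          # kept in non-volatile memory
    c = pow(r, E, n)                # two modular multiplications for e = 3
    return k_i, c

def issuer_recover(c):
    """Step 5: performed by the issuer with the private exponent."""
    r = pow(c, D, N)
    return derive_key(r, N)

card_key, c = card_personalize(N)   # the PC only ever sees (Id, c)
issuer_key = issuer_recover(c)
```

After the run, `issuer_key` equals `card_key` although the personalization computer only handled n and c.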


Note that this is extremely efficient, as the 
card only performs a couple of modular mul- 
tiplications and a single call to SHA-1. More- 
over, we have the following security state- 
ment. 


Corollary 1 (of theorems 1 and 3). Assuming
the random oracle model, under the
RSA assumption, malicious insiders cannot
retrieve the secret key K_i of a functional card.


5.2 An Example Based on the 
Diffie-Hellman Problem 


It is possible to adapt the above protocol
in order to use the Decision Diffie-Hellman
(DDH) as the underlying intractability assumption.
This is done by choosing El-Gamal
encryption [3] to instantiate E_PK instead of
RSA, as follows.


The issuer chooses an abelian group G, denoted
multiplicatively, of large order q, in
which the discrete logarithm is intractable.
An elliptic curve defined over a finite field, or
the group of integers modulo a large prime
p are examples of such a group. The issuer
then chooses a base g ∈ G, a random integer
1 < x < q, stores SK = x and transmits
PK = (g, g^x) := (g, h). The personalization
process now works as follows:


1. the PC transmits the issuer’s public-key
(g, h) to the card with identifier Id,

2. the card selects r uniformly at random
and computes the pair (g^r, h^r),

3. the card computes K_i =
SHA-1(h^r, g, h), memorizes K_i in
non-volatile memory and outputs g^r,

4. the PC sends the pair (Id, g^r) to the issuer,
who later recovers K_i by computing
K_i = SHA-1((g^r)^x, g, h).
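
The same flow can be sketched in Python for the Diffie-Hellman variant. This is a toy illustration under stated assumptions: the group is the integers modulo the prime 2^127 − 1 with base 3 (a real deployment would use a prime-order subgroup or an elliptic curve), the inputs to SHA-1 are encoded as fixed 16-byte integers, and all names are hypothetical.

```python
import hashlib
import secrets

P = 2**127 - 1   # toy prime modulus (assumption; not a recommended group)
G = 3            # toy base

def kdf(shared, g, h):
    """K_i = SHA-1(shared, g, h) over fixed-length big-endian encodings."""
    data = b"".join(v.to_bytes(16, "big") for v in (shared, g, h))
    return hashlib.sha1(data).digest()

# Issuer key generation: SK = x, PK = (g, h) with h = g^x.
x = secrets.randbelow(P - 3) + 2
h = pow(G, x, P)

def card_personalize(g, h_pub):
    """Steps 2 and 3: the card keeps K_i and outputs only g^r."""
    r = secrets.randbelow(P - 3) + 2
    g_r, h_r = pow(g, r, P), pow(h_pub, r, P)
    return kdf(h_r, g, h_pub), g_r

def issuer_recover(g_r):
    """Step 4: the issuer recomputes h^r as (g^r)^x."""
    return kdf(pow(g_r, x, P), G, h)

card_key, g_r = card_personalize(G, h)
issuer_key = issuer_recover(g_r)
```

Because (g^r)^x = (g^x)^r = h^r, both sides hash the same triple and obtain the same K_i.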


In this case, we get the following security re- 
sult. 


Corollary 2 (of theorems 2 and 3). Assuming
the random oracle model, under the
DDH assumption, malicious insiders cannot
retrieve the secret key K_i of a functional card.


6 Conclusion 


We have presented a simple provably secure 
protocol which enables a smart-card manu- 
facturer to act as a trusted personalization 
center without knowing any secret data be- 
longing to the issuer. The proposed solution 
does not require a public-key infrastructure, 
and avoids all the secret-key management 
procedures usually required to guarantee the 
security of the personalization process. 

Abstract


A processor can leak information in different
ways. The possibility of attacking
smart cards by analyzing their power
consumption [Kocher] or their electromagnetic
radiation [Gandolfi] is now commonly
accepted. Many publications recognize
the possibility of recovering the signature of
an instruction in a side channel trace. However,
it seems that no article demonstrates how to
automate the reverse engineering of software code
using this assumption. Our work describes
a method to recognize the instructions carried
out by the processor. More generally,
a classifier makes it possible to identify whether
the right or wrong value was presented during
the comparison of a PIN code, or to identify
large parts of a software code. On a
few micro-controllers, using a classical correlation
between the power trace and a dictionary,
we show how to identify the CPU’s actions.
Sometimes, silicon manufacturers hide
specific opcodes deliberately. The EM investigation
and the template attack demonstrated
by IBM, at Cryptographic Hardware
and Embedded Systems 2002, rely on multivariate
signal processing for electromagnetic
and power traces. The method presented in
this article is based on a self organizing map.
On a CISC processor, it is then easy to
find a hidden instruction by looking for a hole or
a badly constructed region of the map. The case of
pipelined processors is a little bit different: as
they fetch, decode and execute several parts of
different opcodes at the same time, it is more
difficult to recognize a specific signature.


1 Introduction 


The power consumption of processors has been
known for a long time to a restricted group
of people in the smart card community. For
security purposes, the knowledge of this side
channel was kept secret. In 1998, Paul
Kocher introduced the concept of Differential
Power Analysis. It was the first introduction
of real signal processing for smart card attacks.
Differential Power Analysis (DPA)
can be explained as the correlation of two random
variables. Today, people are familiar
with power analysis, but they also know that
current measurement is not the only source of
information leakage. Electromagnetic radiation
can give the same result. Last year,
the security group of Gemplus demonstrated
that the electromagnetic side channel must be
taken into account seriously [Gandolfi]. The
signal-to-noise ratio of electromagnetic analysis
(EMA) is much better than the signal-to-noise
ratio of power analysis. EMA measurements
are nevertheless very noisy. Power analysis measurements
contain lower frequencies than EMA. For the
same processor, the number of traces necessary
to recover the secret key is reduced for
EMA. The practical implementation of power
analysis is very simple to realize, unlike EMA.
One of the big advantages of EMA is the
locality principle. Using a very small sensor,
it is easy to localize precisely a specific
leakage source on a processor [Quisquater].
Current improvements of classical non-intrusive
analysis are linked to sophisticated signal
processing [Boneh]. For differential analysis,
Bayesian methods allow the secret
key to be recovered with far fewer than forty traces.


CARDIS ’02: 5th Smart Card Research & Advanced Application Conference




One of the most important parts of differential
side channel analysis is the decision
criterion. A classical countermeasure against
power or electromagnetic analysis is to defeat
the selection function of the attacker. In
this case wrong guesses appear. An engineer
very familiar with the power trace of a specific
processor can easily recognize each instruction
carried out by the device. Thus it is possible
to build a tool dedicated to parsing the
trace and gathering opcodes during a simple
acquisition. A neural network can improve
the decision criterion. With such a tool, once
the learning phase is done, it should be easy to
recover instructions.
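
As a rough illustration of such a parsing tool, the following Python sketch classifies one trace segment by correlating it with a dictionary of reference signatures, the "classical correlation" approach mentioned in the abstract. Everything numeric here is invented for the example: the template values, the eight-sample segment length and the perturbation are assumptions, not measurements from the paper.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

# Hypothetical dictionary: one made-up reference signature per instruction.
DICTIONARY = {
    "BCF":  [0.9, 0.2, 0.6, 0.1, 0.4, 0.1, 0.7, 0.2],
    "CLRW": [0.3, 0.1, 0.8, 0.2, 0.9, 0.1, 0.2, 0.1],
    "NOP":  [0.2, 0.1, 0.2, 0.1, 0.2, 0.1, 0.3, 0.1],
}

def classify(segment):
    """Return the dictionary entry that correlates best with the segment."""
    return max(DICTIONARY, key=lambda name: pearson(segment, DICTIONARY[name]))

# A "measured" segment: the CLRW template plus a small fixed perturbation.
noise = [0.05, -0.03, 0.02, 0.04, -0.05, 0.01, 0.03, -0.02]
measured = [v + d for v, d in zip(DICTIONARY["CLRW"], noise)]
guess = classify(measured)
```

A neural network or self organizing map replaces the fixed `max` over correlations with a learned decision rule, which is the improvement the paragraph above alludes to.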


Two instructions executed on a Complex
Instruction Set Computer processor can require
a different number of clock cycles.
Generally a Reduced Instruction Set Computer
processor executes one instruction per
clock cycle. The number of clock cycles per
instruction is a source of information for an
attacker. Asynchronous processors do not
have a clock, so their actions are not easily
identifiable. Nevertheless, it is possible to recover
very specific patterns, such as some memory
accesses (charge pump), by parsing power or EM
traces. For a classical processor, the role of
the clock is dominant, and the principal
component of consumption remains linked
to the clock signal.


2 Chip depackaging 


For a long time, smart cards have been famous
for their tamper resistance [Anderson].
They are protected against invasive attacks.
Security sensors are various, so a lot of attacks
can be stopped very early. In order
to discourage attackers, it is made quite difficult
to open the package and directly access
the microprocessor without damaging it.
Many traps have been introduced by designers.
A lot of processors are surrounded by
grids, and physical parameters are checked automatically.
The continuity or the resistance
of these sensors indicates a correct operation
mode to the processor. But the engineers do
not only use passive sensors, they also introduce
false radiation sources, clock jitter, low
temperature and light detectors...


Figure 1: A depackaged processor


Recent attacks make it possible to disable some sensors.
With very concentrated nitric acid and
organic solvent it seems to be possible to depackage
a smart card or a classical processor
without damaging it. When the chip is depackaged
it is very easy to see regular structures
or external sensors, sometimes without
a specific microscope. Inside a structure, the
logical gates are mixed in order to complicate
the task of a person wishing to do reverse engineering.


In figure 1, a classical processor has been
opened. It is based on an eight bit architecture
including a pipeline. All the instructions
are executed in four or eight clock cycles. On
the left side, two identical banks are visible,
and on the other side non-regular structures
are present. This pipelined processor is well
known to pay-TV pirates. It is currently
very widely used with an I²C memory
for counterfeit smart cards. In general, the test circuits
of classical smart cards are specific; this is not
the case here.


3 Power measurements vs elec- 
tromagnetic measurements 


Electromagnetic radiation has recently
been investigated as one of the new
sources of side channel for smart cards.
The magnetic field is much more investigated
than the electric one. The sensor is very often
a coil, placed in the near field of the processor.


If we separate the spectrum into two principal
components, each one dominating largely
over a frequency band, the electric field from
DC to 10 MHz carries information different
from the magnetic field. In fact the propagated
wave is not the same at all. A small
capacitor or a wire simulating an antenna is
enough to seriously investigate the effects of the
electric field. It makes it possible to locate more precisely
some parts of the chip (phase locked
loop, charge pumps, ...). Inspecting this band
also reveals the presence of the
clock signals, often lower than 10 MHz. Some
smart cards integrate their own internal clocks
or frequency multipliers.


The traditional power consumption analysis
[Messerges] recovers the actions of the
processor but does not make it possible to map a chip.
The global consumption measurement is the
sum of the local consumptions of each substructure
of the smart card. It may be possible
with some signal processing to isolate the
consumption of each component. We wish
to retrieve the code executed by the processor.
In order to be able to isolate the circuitry
concerning instructions, we use electromagnetic
radiation, and particularly the electric
field.


4 The pipe-line and the influ- 
ence of each instruction 


The processor we want to analyze contains
a four-stage pipeline. Four clock cycles are
necessary for the processor to carry out an instruction.
But at each clock cycle the processor in fact
performs a fetch, a decode, an execution and a storage,
each for a different instruction.
The influence of the pipeline is thus present
in the side channel trace. More generally,
the preceding instruction modifies the
consumption of the instruction being
executed.


In order to highlight such an assertion, we
have an instruction (A) executed by the processor
followed by several identical instructions
(Bi). This reveals that the first instruction
(B1) after the instruction (A) has its
trace modified by the instruction (A). We can
see here that the first quarter of the trace
is stronger if the instruction before was a Bit
Clear in File register (BCF) with a pad set to
one.


Figure 2: The interaction between two instructions


Instructions can also interact with
other instances of themselves. In order to validate our working
hypothesis, we execute several identical
instructions (Bi). Normally, all the instructions
should have an identical trace. In fact
this is not the case and the "action" of the first
instruction modifies the trace of the second
instruction. This is exactly what figure 2
demonstrates.


The signature of each instruction contains
four peaks. In figure 2, the first peak of
the BCF is directly linked to the instruction
just before, but not only to it. When a BCF follows
another BCF that set
an external pad to one using a
particular register, the first peak is different.
Between the BCF 0 and the BCF 1, the difference
is visible, and is only related to the
external action of the preceding instruction.


5 An instruction signature 


Each instruction gives a different trace under
power analysis [Fahn]. But the signature of
an instruction is also an expression of its own
address in memory, the data handled and
sometimes the address where the data will
be stored. The Hamming weight of each
data/address word is clearly visible by
electromagnetic analysis, but it is still
impossible to distinguish 55 from AA, as their
Hamming weight is the same. Using the electric
field, it is possible to recover more
information. Our antenna is based on a bonding
wire, so if the
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Figure 3: Two signatures with two addresses 
and two Hamming weights


architecture of the bus does not balance
the radiated electric field, it is possible, by
being closer to one group of wires, to identify
their activity. A possible countermeasure against
such an analysis is to use two wires to code
0 and 1. This is called dual logic (dual-rail
encoding); the two remaining states are free to
signal alarms (01 = 0, 10 = 1, 00 = 11 = alarm).
Ross Anderson, Markus Kuhn and Sergei
Skorobogatov suggest using this architecture to
design new processors. If a pair of wires is set
to 00 or 11 it produces a detectable fault.
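
The coding described above can be sketched in a few lines of Python. This is only an illustration of the 01/10/alarm convention from the text: the function names and the "number of wires at level 1" leakage measure are assumptions made for the example, not part of the original work.

```python
# Dual-rail coding as described in the text: 01 = 0, 10 = 1, 00 and 11 = alarm.
ENCODE = {0: (0, 1), 1: (1, 0)}
DECODE = {(0, 1): 0, (1, 0): 1}

def to_bits(value, width=8):
    """Most-significant-bit-first list of the bits of value."""
    return [(value >> i) & 1 for i in reversed(range(width))]

def encode_word(bits):
    """Map each logical bit to its pair of wire levels."""
    return [ENCODE[b] for b in bits]

def decode_word(pairs):
    """Recover the logical bits; raise on an alarm state (00 or 11)."""
    out = []
    for pair in pairs:
        if pair not in DECODE:
            raise ValueError(f"alarm: invalid wire pair {pair}")
        out.append(DECODE[pair])
    return out

def wires_high(pairs):
    """Number of wires at level 1 (a crude, assumed stand-in for the leakage)."""
    return sum(a + b for a, b in pairs)

w00 = encode_word(to_bits(0x00))
w55 = encode_word(to_bits(0x55))
wff = encode_word(to_bits(0xFF))
```

Under this model every 8-bit word drives exactly eight wires high, whatever its Hamming weight, which is why the encoding is expected to balance the radiated field.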


The influence of an instruction's address on
its signature can easily be shown. The same
pair of instructions executed several times
at different addresses does not yield a
constant trace, because the Hamming weights
involved are very different. Figure 3 shows
the influence of the address on the first and
last clock cycles. This figure represents two
CLear Work register (CLRW) instructions at two
addresses with extremely different Hamming
weights. The first modification corresponds to
the address of the instruction currently being
executed, whereas the last peak is the
expression of the address of the next
instruction to be executed by the pipeline.


Moreover, consumption traces clearly indicate
that consecutive addresses sometimes have an
identical Hamming weight. Figure 4 details the
consumption trace of the instruction SUBtract
Literal from Work register (SUBLW) for nine
consecutive addresses. The first clock cycle is
influenced by the address of the instruction
executed. Two consecutive addresses with the
same Hamming
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Figure 5: The influence of data’s Hamming 
weight 


weight generate two identical power peaks. 
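
A short Python sketch makes both observations concrete: 55 and AA cannot be told apart by Hamming weight alone, and some consecutive addresses share a weight. The address range 0x100 to 0x108 is a hypothetical choice for the example and is not taken from the measurements.

```python
def hamming_weight(value):
    """Number of bits set to 1 in value."""
    return bin(value).count("1")

def same_weight_neighbours(first, count):
    """Addresses a in [first, first + count - 1) whose successor a + 1
    has the same Hamming weight as a."""
    return [a for a in range(first, first + count - 1)
            if hamming_weight(a) == hamming_weight(a + 1)]

# 0x55 and 0xAA both have four bits set.
weights = (hamming_weight(0x55), hamming_weight(0xAA))

# Nine consecutive (hypothetical) addresses 0x100..0x108.
pairs = same_weight_neighbours(0x100, 9)
```

Here `pairs` lists the addresses whose successor would, under a pure Hamming-weight leakage model, produce an identical first peak.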


The Hamming weight of the manipulated
data is important too [Kommerling]. On our
processor, the third peak expresses this data.
Figure 5 shows that a nearly linear relation
exists between the Hamming weight and the
consumption. Of course, the processor under
analysis does not contain any countermeasures
against power or electromagnetic analysis.
Advanced smart card processors are generally
protected against non-intrusive analysis
[Coron]. In some processors the data bus is
encrypted (and sometimes the address bus too).
If an attacker is able to recognize patterns
using the Hamming weight, he may be able
to extract cryptographic keys [Kuhn]. To
prevent such an analysis, implementations of
cryptographic algorithms are secured
against power and electromagnetic analysis.


Transferring an address modifies the shape of
an instruction's power trace, in particular its
very last peak. The linear behaviour observed
for data is maintained for addresses. The main
difference between the influence of the Hamming
weight of an address and that of data is its
location in the signature.
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Figure 6: The sleep instruction 


6 A correlation attack 


The processor we decided to investigate can
be described with only one template for all
its instructions. But on some processors the
instructions do not all have the same kind of
signature. In a few cases, the power traces
[Biham] are very different for each instruction
compared to the others, and their automatic
recognition is commonplace.


In our case, for the processor studied, each
instruction has a signature spanning four clock
cycles. Moreover, the values of the various
peaks are directly related to the opcode.
Figure 6 illustrates this: the sleep instruction
gives a characteristic signature just
before the deactivation of the processor.


It is then possible to recognize the
instructions using a simple correlation with a
dictionary. Building a dictionary where each
instruction is represented by t_n measurement
points is not so difficult. Each power or EM
trace is saved in the dictionary as S_n points.
By multiplying each point t_i by the
corresponding value S_i, and by normalizing the
result, we obtain a specific opcode detector.
Standardization is not necessary because, if
measurements are carried out in the same
environment as during the creation of the
dictionary, the voltages and the measured
currents are the same too. The only requirement
is a comparator triggered on a high threshold,
in order to guard against false alarms.
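
A minimal sketch of such a dictionary detector is given below, assuming the trace has already been cut into windows aligned on instruction boundaries. The signatures, the trace values and the threshold of 0.95 are invented for the illustration; only the principle (point-by-point product, normalization, high threshold) comes from the text.

```python
import math

def normalized_correlation(window, template):
    """Sum of point-by-point products divided by the two norms."""
    dot = sum(t * s for t, s in zip(window, template))
    norm = (math.sqrt(sum(t * t for t in window))
            * math.sqrt(sum(s * s for s in template)))
    return dot / norm if norm else 0.0

def recognise(trace, dictionary, threshold=0.95):
    """Name the best-matching dictionary entry for each consecutive window.
    Windows whose best score is under the threshold are reported as None."""
    n = len(next(iter(dictionary.values())))
    found = []
    for start in range(0, len(trace) - n + 1, n):
        window = trace[start:start + n]
        name, score = max(
            ((op, normalized_correlation(window, tpl))
             for op, tpl in dictionary.items()),
            key=lambda pair: pair[1],
        )
        found.append(name if score >= threshold else None)
    return found

# Hypothetical four-point signatures, one value per clock-cycle peak.
dictionary = {
    "BCF":   [3.0, 1.0, 2.0, 1.0],
    "CLRW":  [1.0, 1.0, 0.5, 3.0],
    "SLEEP": [0.5, 3.0, 3.0, 0.2],
}
trace = [3.1, 0.9, 2.0, 1.1,  0.6, 2.9, 3.1, 0.2,  1.0, 1.0, 1.0, 1.0]
result = recognise(trace, dictionary)
```

The third window resembles none of the stored signatures closely enough, so the threshold rejects it instead of forcing a wrong opcode.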


The calculation of correlation avoids
permanent resynchronization. To isolate the
known patterns from a consumption trace,
it is necessary to be synchronous with the
trace [Kelsey]. For the moment our experiments
are not executed in real time; our traces are
stored over a long period using a memory board
of 1 gigabyte. Then the values go through a
correlator which, after a comparison against a
threshold, provides the name of the instruction
identified together with the Hamming weight
concerned. This method achieves a high
recognition rate (better than 87%) and works
better for CISC processors than for RISC
ones. We tested it on a Z80 processor and
managed to recover more than 95% of a program.
Parsing the power and EM trace, we were able to
extract each pattern and to identify it, using
the dictionary built beforehand.


Ideally we would like to recover the code
executed on RISC processors, but as many
instructions have very similar signatures, the
error rate becomes significant.


7 Self organizing maps 


It is then necessary to change the structure 
we used. We have decided to use an auto- 
matic classifier. This work can be done by a 
neural network. 


Kohonen's self-organizing maps are
based on a network of K neurons with N
inputs. The network has K outputs. The inputs
are vectors with N components, all fully
connected to the K neurons of the network by
NK modifiable connections. The neurons of
the network are placed in a two-dimensional
space. Each neuron has neighbors in this
space. Each neuron has lateral connections
to its neighbors, weighted according to a
Mexican hat convolution kernel.


It is supposed that the weights of the
modifiable connections are initially random.
For an input vector $X$, the output of the
network is a vector $y$ with $K$ components:


$$y_i = \sum_{n=1}^{N} W_{ni} X_n = X^T W_i$$


In this equation $W_i$ is the weight vector of
neuron $i$, i.e. the vector with $N$ components
$W_{ni}$. According to the vector $X$ and the
initial configuration of the weights, there is a
neuron $i_0$ whose output is the largest. This
neuron is considered the winner.

The activity of the neurons close to the
neuron $i_0$ is facilitated, while that of distant
neurons is inhibited. After equilibrium is
reached, the output of the network reveals a
zone of dominant activity around the winning
neuron, surrounded by inactive or slightly
active zones. The modifiable connections are
then adjusted according to the rule:


$$\Delta W_i = \alpha \, y_i \, (X - W_i)$$


The highest values of $y_i$ are found in
the most active zone, i.e. around the neuron
$i_0$, where the corrections are large, while they
are very weak in the slightly active zones and
null in the inactive zones. Qualitatively, the
synaptic correction tends to make the neuron
$i_0$ more selective to the data $X$. Indeed, the
output of the neuron $i_0$ is the largest, so
before the correction one has:


$$\forall i \neq i_0, \quad y_{i_0} = X^T W_{i_0} > y_i = X^T W_i$$


After the correction, replacing $W_i$ by
$W_i + \Delta W_i$, we obtain:


$$\begin{aligned}
y_i &= X^T (W_i + \Delta W_i) \\
    &= X^T [W_i + \alpha X^T W_i (X - W_i)] \\
    &= X^T W_i \, (1 + \alpha X^T (X - W_i))
\end{aligned}$$


If, for $i = i_0$, the scalar product
$X^T (X - W_i)$ is positive, then after the
correction of the weights the selectivity of the
neuron $i_0$ to the vector $X$ is increased. The
neurons linked by lateral connections are also
strongly affected by the modification of the
weights [Kohonen].
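
The update rule can be checked numerically with a small Python sketch. It is deliberately simplified: only the winning neuron is corrected and the lateral (Mexican hat) interaction is left out, which is an assumption of this example and not the full network described above. The weights, input and learning rate are arbitrary illustrative values.

```python
def outputs(weights, x):
    """y_i = X^T W_i for each neuron i (weights[i] is the vector W_i)."""
    return [sum(xn * wn for xn, wn in zip(x, w)) for w in weights]

def som_step(weights, x, alpha):
    """Find the winner i_0 and apply Delta W = alpha * y * (X - W) to it only."""
    y = outputs(weights, x)
    winner = max(range(len(y)), key=y.__getitem__)
    w = weights[winner]
    updated = [wn + alpha * y[winner] * (xn - wn) for xn, wn in zip(x, w)]
    new_weights = [list(v) for v in weights]
    new_weights[winner] = updated
    return winner, new_weights

weights = [[0.2, 0.1], [0.5, 0.4]]   # K = 2 neurons, N = 2 inputs
x = [1.0, 0.0]
winner, new_weights = som_step(weights, x, alpha=0.1)
y_before = outputs(weights, x)[winner]
y_after = outputs(new_weights, x)[winner]
```

With these values $X^T(X - W_{i_0}) = 0.5$ is positive, and the winner's output grows from 0.5 to $0.5 \, (1 + 0.1 \times 0.5) = 0.525$, in agreement with the last line of the derivation.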


8 The protocol 


A neural network learns the signature
(power consumption and electromagnetic
analysis) of an instruction, and then
recognizes it later automatically.


We have to store hundreds of traces for each
instruction of a processor. To be able to
identify the instruction, we first set a pad to
one, to change the power consumption, and
then execute the instruction. To reduce the
influence of the preceding instruction, we
insert a "nop" instruction before the
instruction we want to analyze. We used this
instruction because we noticed that the "nop"
had no influence on the instruction just after
it. We keep the trace of the instruction (its
signature) and then repeat the procedure with
the electromagnetic field.


So, at the end, we have hundreds of
signatures for each instruction. Some
instructions are more complex than others (one
or two parameters), and for these we store
several hundred signatures. We change just one
parameter at a time (address of the instruction,
data manipulated or address of the data
manipulated) so as to fix a large part of the
signature. Once the total set of signatures per
instruction has been presented to the neural
network, it is possible to classify them. Each
signature defines a zone in a space with its
parameters, and the neural network determines
the centroid of this area.


Then when you present the power signa- 
ture or the electromagnetic signature of an 
instruction to the neural network, it is able 
to recognize it and to give you the class of 
the instruction. 


9 A practical case: reversing a 
code 


As we did for the correlation, we built two
networks, one for the four-clock-cycle
instructions and the second for the rest. Using
the same detector as before, with as input the
shape of an average signature obtained over all
35 instructions, it is possible to isolate the
patterns to be presented to the neural
network. In the case of our processor, we
succeeded in reverse engineering the code and
the Hamming weights concerned in 93% of the
cases. Figure 7 shows the measured trace with
the values obtained at the output of the neural
network.


Figure 7: A power trace and the comments
after code recognition


The same technique can be used to attack
PIN codes. Once a network has learned traces
of wrong PIN code comparisons, it is able to
characterize the difference between the
proposed PIN code and the right one. We
managed to defeat a PIN code comparison on an
old GSM phone card. It is also very interesting
to note that some PIN code comparisons are
still sensitive to timing attacks. As the
bytes are compared in a given order, and not
a random one, it is possible to recover the PIN
code by watching the response time. Of course,
a very simple countermeasure is to randomize
the comparison or to wait a constant time
before answering.
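
The timing weakness of an ordered comparison can be illustrated with a small Python model. The number of byte comparisons stands in for the response time, and the four-digit PIN and helper names are hypothetical; this is a sketch of the principle, not a model of any particular card.

```python
import hmac

def early_exit_compare(candidate, secret):
    """Compare byte by byte in a fixed order, stopping at the first mismatch.
    Returns (match, number of byte comparisons performed)."""
    steps = 0
    for c, s in zip(candidate, secret):
        steps += 1
        if c != s:
            return False, steps
    return len(candidate) == len(secret), steps

def constant_time_compare(candidate, secret):
    """Countermeasure: comparison whose duration does not depend on
    the position of the first differing byte."""
    return hmac.compare_digest(candidate, secret)

def recover_pin(secret, digits=b"0123456789"):
    """Recover the PIN one position at a time from the step count alone."""
    guess = bytearray(b"0" * len(secret))
    for pos in range(len(secret)):
        for d in digits:
            guess[pos] = d
            ok, steps = early_exit_compare(bytes(guess), secret)
            if ok or steps > pos + 1:   # comparison got past this position
                break
    return bytes(guess)

recovered = recover_pin(b"4071")
```

In this model at most ten trials per position are needed, so a four-digit PIN falls in at most 40 trials instead of up to 10,000, which is why a constant-time or randomized comparison is suggested.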


In some smart cards, the processor
manufacturer has hidden instructions that
perform a specific task. It is well known that
some governmental agencies asked Digital to
add or remove some instructions in the Alpha
processor (Hamming weight). It is very
difficult to recover these instructions. With
our method, when instructions that were never
presented to the network before appear in the
power or EM traces, the correlation
between the form indicated by the network
[Kohonen2] and the treated data shows that
the network never met such an instruction in
the past: the probability of a correct guess
is low for all the outputs.


10 Work in progress 


In the future we will focus on the signal
processing part. At present the acquisition
chain (acquisition at 125 MHz with 12-bit
resolution) is good enough, and we have to
investigate new neural networks to improve the
selection of each instruction. We decided to
start with neural networks based on the
K-Nearest Neighbors. However, this does not
seem to be the only solution, and a Multi-Layer
Perceptron looks promising too. Of course a
vector quantizer (QV) based on a Voronoi
diagram can give results. But we have to test
and select new criteria for the network
(Manhattan distance). It may be possible to
explain completely the influence of the pipeline
using a source separator. The representation of
the data is very important too; we have to
avoid synchronization problems by using a
wavelet transform or by modifying the
architecture. We are currently testing a
commercial crypto-processor with our network.


11 Conclusion 


This article presents a use of traditional
correlation techniques and SOMs to find the
instructions executed by a very simple
processor with only 35 instructions. To
improve the signal-to-noise ratio of the treated
data, the electric field makes it possible,
under certain conditions, to find the exact
values of the handled data. Obviously this
looks more like an academic case than a directly
usable attack. Indeed, processors for smart
cards contain countermeasures that slow down
or prevent such attacks. However, it is
important to note that side-channel attacks
will keep increasing, because the
possibilities to correlate the data are
multiple, and signal processing makes it
possible to increase the effectiveness of the
attacks.
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Abstract. In smartcard encryption and signature 
applications, randomised algorithms are used to 
increase tamper resistance against attacks based on side 
channel leakage. Recently several such algorithms 
have appeared which are suitable for RSA 
exponentiation and/or ECC point multiplication. We 
show that under certain apparently reasonable 
hypotheses about the countermeasures in place and the 
attacker’s monitoring equipment, repeated use of the 
same secret key with the algorithm of Liardet and 
Smart is insecure against any side channel which leaks 
enough data to differentiate between the adds and 
doubles in a single scalar multiplication. Thus the 
scalar needs to be blinded in the standard way, or some 
other suitable counter-measures employed, if the 
algorithm is to be used safely in such a context. 


Key words: m-ary exponentiation, Liardet-Smart 
randomized algorithm, ECC, addition chains, sliding 
windows, addition-subtraction chains, power analysis, 
SPA, SEMA, blinding, smartcard. 


1 Introduction 


Major progress in the theory and practice of side 
channel attacks [5, 6] on embedded cryptographic 
systems threatens to enable the capture of secret keys 
from single applications of cryptographic functions 
[10, 11, 14]. This is particularly true for the more 
computationally intensive functions such as 
exponentiation, which is a major process in many 
crypto-systems such as RSA, ECC and Diffie-Hellman. 


Timing attacks on modular multiplication can usually 
be avoided easily by removing data-dependent 
conditional statements [16], but, with timing variations 
removed, attacks which make use of data-dependent 
variation in power and electro-magnetic radiation 
become easier. Initial attacks of this type required 
averaging over a number of exponentiations [8]. 


One counter-measure is to modify the exponent from e 
to e+rg where r is a random number, typically 32 bits, 
and g is the order of the (multiplicative) group in which 
the exponentiation is performed [5]. This results in a 
different exponentiation being performed every time. 
However, if squares and multiplications can be 
distinguished during a single exponentiation, then use 
of the standard binary exponentiation algorithm 
immediately leads to exposure of the secret key. 
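
As a small numerical illustration of this blinding, the sketch below works in the multiplicative group modulo a prime p, whose order is g = p - 1. The toy prime 1019, the base 5 and the exponent 123 are arbitrary values chosen for the example; real parameters are of course far larger.

```python
import secrets

def blinded_exponent(e, group_order, bits=32):
    """Return e + r*g for a fresh random r of the given bit length."""
    r = secrets.randbits(bits)
    return e + r * group_order

p = 1019            # small prime, toy modulus for illustration only
g = p - 1           # order of the multiplicative group modulo p
x, e = 5, 123

exponents = [blinded_exponent(e, g) for _ in range(5)]
results = {pow(x, b, p) for b in exponents}
```

Every blinded exponent gives the same result as the unblinded one, because x raised to the group order is 1, yet the sequence of squares and multiplications differs from run to run.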


For elliptic curve cryptography (ECC), the most 
efficient schemes for point addition and point doubling 
involve different numbers of operations in the field over 
which the curve is defined, and these numbers vary 
depending on the representation used for the curve. A 
counter-measure which reduces the likelihood of 
distinguishing between these point operations involves 
equalising the number and type of the component field 
operations [12] or making the point addition look 
exactly the same as two point doublings [1]. 


However, squares and multiplications in the field 
behave differently [13] and so there is no reason to 
believe that such recoding will necessarily hide fully 
the distinction between point additions and point 
doublings: for example, in [12], field squares appear for 
point additions, but field cubes when the same formula 
is used for point doublings. Side channels can 
distinguish these if the Hamming weight of arguments 
can be deduced. So exponentiation algorithms are 
chosen in which there is still an ambiguity in the 
correspondence between multiplications (i.e. point 
additions in ECC terms) and properties of the secret key 
(such as bit or digit values). M-ary exponentiation [4] 
for m > 2 provides one solution because each addition 
represents an unknown choice from a set of several 
non-zero digits. 


If the same unblinded key value is re-used for many 
exponentiations, there is a danger that the repeated use 
of the same operand can be determined [14]. This 
would enable individual digits of the exponent base m 
to be identified and hence the key recovered. 
Unfortunately, particularly for ECC as opposed to RSA, 
applying the above exponent blinding technique is 
expensive when the secret key is typically only 192 
bits. It adds about 17% to the cost of point 
multiplication. Hence randomised exponentiation 
algorithms may be a preferred option for ECC. 


There are currently several algorithms which randomise 
the operations associated with specific inputs so that the 
exponentiation scheme is different on successive runs 
with the same data [7, 9, 17, 2, 3]. That of Liardet and 
Smart [7] uses a sliding window of random, variable 
width. If the attacker's equipment is insufficient to 





USENIX Association 


CARDIS ‘02: 5 Smart Card Research & Advanced Application Conference 




obtain information from a single EC point 
multiplication, then it seems that averaging over 
different multiplications with the same key would dilute 
any data dependency in the side channel leakage. 
However, we will show here that if individual point 
multiplications do leak information about what 
operation is being performed, then the secret key can be 
obtained straightforwardly. Indeed, one might even be 
better off with m-ary multiplication. 


We begin by recalling the algorithm and looking at 
various parameters which might be chosen to improve 
efficiency or security. Next, the assumptions about the 
attacker’s equipment and cryptosystem counter- 
measures are outlined. These are initially quite tight to 
make the presentation of the attack easier. The attack 
starts with extracting a least significant digit, and then 
uses this repeatedly to reconstruct one possible 
representation for the secret key. An essential part of 
the discussion is an assessment of the probability 
that the attack can be completed successfully. 
Before concluding with some counter-measures and 
alternatives, we explain how the attack can still be 
performed in a more realistic environment where the 
side channel leakage is much poorer. 


2 The Algorithm 


This section contains a brief outline of the 
(exponentiation) algorithm of Liardet and Smart. 
Because it generates an addition-subtraction chain 
rather than simply an addition chain, inverses have to 
be computed when it is applied. This means that 
applications to RSA cryptography are unlikely because 
of the expense of computing inverses. However, in 
elliptic curve cryptography (ECC), inverses are 
essentially for free. Hence, we will assume the 
algorithm is applied to an additive group, such as that 
formed by the points on an elliptic curve, and use 
appropriate terminology. Processing of the secret key k 
therefore produces a sequence of instructions which 
result in additions (A) and doublings (D) of group 
elements. 


Suppose we wish to compute the element $Q = kP$ for a 
given positive integer $k$ (the secret key) and a given 
member $P$ of some group $E$. As in m-ary 
exponentiation, Liardet and Smart pre-compute the odd 
multiples $iP$ of $P$ for integers $i \in (-\tfrac12 m, \tfrac12 m]$ where 
$m = 2^R$, and then employ the standard sliding windows 
technique but with a window which has a random width 
showing up to $R$ bits. In other words, $k$ is recoded to 
obtain digits $k_i$ ($0 \le i \le n$) which are determined using a 
randomly-chosen variable base $m_i$ which divides $m$. 
The digits are chosen in the order $k_0, k_1, \ldots, k_n$ and the 
digit representation $k_n k_{n-1} \ldots k_1 k_0$ satisfies 


$$k = ((\ldots((k_n) m_{n-1} + k_{n-1}) m_{n-2} + \ldots) m_1 + k_1) m_0 + k_0 \qquad (1)$$


The group element multiplication processes these digits 
from most to least significant following the related 
scheme defined by 


$$kP = m_0(m_1(\ldots m_{n-2}(m_{n-1}(k_n P) + k_{n-1} P) + \ldots) + k_1 P) + k_0 P \qquad (2)$$


2.1 Code for the Key Recoding 


More explicitly, if minmod is the function which returns 
a residue of minimal absolute value, the algorithm for 
choosing the digits is this: 


RANDOMISED SIGNED m-ARY WINDOW DECOMPOSITION [7]:
i ← 0;
While k > 0 do
{  If (k mod 2) = 0 then
   {  m_i ← 2;
      k_i ← 0;
   }
   else
   {  Randomly choose base m_i ∈ {2^1, 2^2, ..., 2^R};
      k_i ← k minmod m_i;
   }
   k ← (k − k_i)/m_i;
   i ← i+1;
}.


Here, both 1 and −1 could be allowed as digits for base 
2, but that involves the added complication of a random 
bit to decide which to select, and also (to avoid non- 
termination) restricting the choice to only 1 when k 
reaches 1. Our attack would work also in these 
circumstances with few changes. 
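
The recoding can be made concrete with a short Python sketch. It assumes that minmod returns the residue in the interval (-m/2, m/2] and that bases are drawn uniformly from 2^1 to 2^R; the text leaves the distribution open, so the uniform choice is only an assumption of this example, as are the function names.

```python
import random

def minmod(k, m):
    """Residue of k modulo m of minimal absolute value, taken in (-m/2, m/2]."""
    r = k % m
    return r - m if r > m // 2 else r

def recode(k, R, rng=random):
    """Randomised signed window decomposition of k.
    Returns (digits, bases), least significant first: k_0, k_1, ... and m_0, m_1, ..."""
    digits, bases = [], []
    while k > 0:
        if k % 2 == 0:
            m_i, k_i = 2, 0
        else:
            m_i = 2 ** rng.randint(1, R)   # assumed uniform choice of base
            k_i = minmod(k, m_i)
        digits.append(k_i)
        bases.append(m_i)
        k = (k - k_i) // m_i
    return digits, bases

def evaluate(digits, bases):
    """Rebuild k from the digits with the nested form of equation (1)."""
    k = 0
    for k_i, m_i in zip(reversed(digits), reversed(bases)):
        k = k * m_i + k_i
    return k

digits, bases = recode(12345, R=3, rng=random.Random(1))
```

Running `recode` twice on the same key generally gives different digit sequences, but `evaluate` always returns the original key, which is the property the randomisation relies on.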


2.2 Efficiency Considerations 


There are still some parameters to be chosen in the 
algorithm. Varying these affects efficiency, but there 
are also security implications. As we see later, certain 
choices will increase the difficulty of mounting the 
attack, forcing, in particular, more samples to be 
required. 


The value for R has the greatest effect on efficiency. In 
elliptic curve applications, subtraction may have the 
same cost as addition. Then it will be unnecessary to 
store the negative pre-computed multiples of the input 
point. So only space for $2^{R-2}$ multiples is likely to be 
required. Increasing R improves speed, but with 
diminishing returns for the space required for pre- 
computed values. 


No suggestions are made in [7] about how to choose the 
$m_i$ randomly. A uniform distribution is not very 
efficient, and indeed perhaps the least secure under this 
attack. It is most efficient to make the maximum 
possible use of the pre-computed values by choosing 
the maximum base size $2^R$ always. But, to maintain 
generality for later, suppose $m_i = 2^j$ is chosen with 
probability $p_j$ when k is odd and $p_0 = 1$ is the probability 
of selecting base 2 when k is even. 


Choosing $p_R = 1$ means that $m_i = 2^R$ whenever k is odd. 
This yields the usual m-ary sliding window method 
with fixed $m = 2^R$. Taking R = 1 yields the usual binary 
“square-and-multiply” algorithm. However, such 
choices would remove any non-determinism from the 
sequence of point operations. 


Observe that biasing in the choice of $m_i$ does not 
change the uniformity in the distribution of residues 
$k \bmod m_i$, inherited from k, assuming k is randomly 
chosen. This means that every new key value k 
generated during the recoding retains the same random 
properties: in particular, residues modulo 2 will be 
uniform for every key encountered. 


3 The Attack 


3.1 Introduction & Initial Hypotheses 


The purpose of randomised exponentiation algorithms 
is to frustrate side channel analysis by an attacker. In 
particular, they are counter-measures against using 
knowledge of the exponentiation process to extract the 
secret key k. Several different levels of leakage are 
possible, depending on the resources of the attacker. A 
poor signal-to-noise ratio (SNR) means that many 
samples have to be taken, and averaging the side 
channel leakage is one way of improving the SNR. So 
a critical parameter is whether or not the attacker’s 
equipment is good enough for him to extract sufficient 
meaningful data from the side channel trace of a single 
scalar multiplication. If it is, then the standard key 
blinding described earlier suddenly fails to provide the 
data hiding protection afforded by averaging away local 
data dependencies. Improved equipment and laboratory 
techniques mean that this barrier might soon be 
breached without excessive expenditure [10, 11]. 


The categories of leakage which could be considered 
include the following: 


i) individual point operations can be observed on 
power, EM or other side channel traces; 
ii) point doublings and point additions can be 
distinguished from each other; 
iii) re-use of operands can be observed; and 
iv) operand addresses can be deduced. 

Point (i) may hold simply because program instructions 
and data need to be fetched at the start of each point 
operation, and these cause different effects on the side 
channels than field operations. Point (ii) may then hold 
as a result of different patterns of field operations for 
point additions than for point doublings. 
Properties (iii) and (iv) might hold as a result of being 
able to deduce Hamming weights of data and address 
words travelling along the bus. 


Randomisation prevents the obvious averaging of the 
traces of many point multiplications which was used in 
initial power analysis attacks on the binary “square- 
and-multiply” algorithm. Here every point multiplic- 
ation determines a different sequence of doublings and 
additions. With matched code for additions and 
doublings, averaging may hide the difference between 
the two operations because they are no longer separated 
in time, but in current implementations such averaging 
will certainly reveal the start and end of the individual 
point operations which make up the scalar 
multiplication. 


The attack described here requires the SNR to be good 
enough to extract some useful data from single 
multiplications on the curve. Specifically, initially we 
assume that 


• Adds and doublings can always be identified 
correctly and distinguished from each other using 
traces obtained from side channel leakage for a 
single point multiplication, and 


• A number of traces are available corresponding to 
the same secret key value applied to different 
scalar multiplications. 


Both of these hypotheses will be relaxed later to some 
extent, providing a more realistic scenario. 


3.2 Overview of the Attack & Notation 
The outline of the attack is as follows. For simplicity, 
by the first hypothesis, 


• Every trace can be viewed as a word over the 
alphabet {A,D}. 


Every occurrence of an A (add) in the trace splits the 
word into a prefix and a suffix which correspond to two 
integers $k_p$ and $k_s$ that are precisely defined in terms of k 
and the position of A in the trace. All traces determine 
the same values to within ±1. By looking at the 
patterns to the left of a given A, one obtains the residue 
of $k_p$ modulo a small power of 2, and hence a few bits 
of k. Repeating this for the position of each A enables 
all the bits of k to be recovered. 


Definition. The position of a specific instance of 
character A or D in a trace word is the number of Ds 
which are to the right of the selected character. 


We will exploit a close relationship between positions 
in which A appears in traces and bits which are 1 in the 
corresponding position of the binary representation of k. 


In order to be able to give examples, we fix the 
character order of words over {A,D} to correspond to a 
left-to-right processing order. Thus, the digit sequence 
1,021,404, with most significant bit on the left, is 
processed from most to least significant bit, i.e. from 
left to right, and so would result in the word 
DADDDADD. There are As in positions 2 and 5, and 
Ds in positions 0 to 5. In fact, every A is paired with a 
preceding D with the same position, and so one could 
view the DA combination as a single character. Then 
the position would correspond directly to a character 
index, counting from 0 at the right hand end, as in a 
binary representation. 


The initial DA corresponds to the digit 1, and might be 
omitted if efficient initialisation takes place instead. 
Assuming this is the case, we will delete any initial Ds 
but leave the initial A as an unambiguous reminder that 
there is an initial digit to take into account. Thus words 
always commence with A. In the above example, 
ADDDADD will be the word corresponding to the 
given digit sequence. 
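
The correspondence between digits, words and positions can be checked with a few lines of Python. The (digit, base) pairs below are a hypothetical assignment chosen only because it reproduces the word DADDDADD of the example; the function names are likewise illustrative.

```python
def trace_word(digits_msb_first):
    """Word over {A, D} for a list of (digit, base) pairs, most significant
    first: log2(base) doublings, then an addition if the digit is non-zero."""
    word = []
    for digit, base in digits_msb_first:
        word.extend("D" * (base.bit_length() - 1))
        if digit != 0:
            word.append("A")
    return "".join(word)

def positions(word, char):
    """Positions of every occurrence of char: the number of Ds to its right."""
    return [word[i + 1:].count("D") for i, c in enumerate(word) if c == char]

# Hypothetical digit/base pairs consistent with the word in the example.
word = trace_word([(1, 2), (0, 2), (1, 4), (0, 4)])
a_positions = positions(word, "A")
d_positions = positions(word, "D")
stripped = word.lstrip("D")   # convention: drop initial Ds, keep the initial A
```

The computed positions agree with the statement that the As lie in positions 2 and 5 and the Ds in positions 0 to 5, and stripping the leading D gives the word ADDDADD.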


3.3 Properties of Key Digits 


We now look at the sequence of digits generated by the 
Liardet-Smart algorithm. The notation used here is the 
same as for a fixed base, and many standard properties 
have analogues. 


Lemma 1. Suppose $k_n k_{n-1} \ldots k_1 k_0$ is a digit 
representation of $k$ generated by the Liardet-Smart algorithm, 
with sequence $m_n, m_{n-1}, \ldots, m_1, m_0$ of bases. For some $i$, 
let $k_p^{(i)}$ denote the integer corresponding to the prefix 
$k_n k_{n-1} \ldots k_i$ and let $k_s^{(i)}$ denote the integer corresponding 
to the suffix $k_{i-1} \ldots k_1 k_0$. Then $k = k_p^{(i)} m^{(i)} + k_s^{(i)}$ where 


$$m^{(i)} = \prod_{j=0}^{i-1} m_j \quad \text{and} \quad |k_s^{(i)}| < m^{(i)}.$$

This lemma is obvious from the definition of the digit 
sequence given by equation (1) except, perhaps, for the 
last part. That part follows easily by induction. Digit $k_i$ 
is chosen with $|k_i| \le \tfrac12 m_i \le m_i - 1$. This property for 
$i = 0$ starts the induction. Then the induction step is 

$$|k_s^{(i+1)}| = |k_i m^{(i)} + k_s^{(i)}| < (|k_i| + 1)\, m^{(i)} \le m_i m^{(i)} = m^{(i+1)}.$$


We will continue to use the notation $m^{(i)}$ for 
$\prod_{j=0}^{i-1} m_j$, and $k_p^{(i)}$ and $k_s^{(i)}$ for the key values 
associated respectively with the prefix $k_n k_{n-1} \ldots k_i$ and 
the suffix $k_{i-1} \ldots k_1 k_0$ of the digit sequence. The next 
lemma uses the equality and bound of Lemma 1 to identify 
two possible values for $k_s^{(i)}$, corresponding to it being a 
positive or negative residue of $k$ modulo $m^{(i)}$: 


Lemma 2. With the previous notation, either 

i) $k_s^{(i)} = k \bmod m^{(i)}$ and $k_p^{(i)} = k \operatorname{div} m^{(i)}$, or 

ii) $k_s^{(i)} = (k \bmod m^{(i)}) - m^{(i)}$ and $k_p^{(i)} = (k \operatorname{div} m^{(i)}) + 1$ 
where mod returns the least non-negative residue, and 
div is integer division given by rounding down the real 
quotient. 


This shows that, whatever choices are made for the base 
elements, a given digit suffix can determine only one of 
two possible values when the product of the 
corresponding base elements is fixed. We would like at 
least the occurrence of the one which will make the 
corresponding prefix odd because it leads to an addition 
(A) which can be used to identify a corresponding point 
in the trace. 
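
A small worked example may help. For k = 11 and a base product of 4, the single base 4 gives digit -1 and prefix 3, whereas two successive bases 2 give digits 1, 1 (suffix value 3) and prefix 2. The Python sketch below, with a hypothetical helper name, lists the two candidate (prefix, suffix) pairs of Lemma 2 and checks them against the identity of Lemma 1.

```python
def prefix_suffix_candidates(k, m):
    """The (prefix, suffix) pairs allowed by Lemma 2 for a base product m.
    The negative-residue case is omitted when m divides k, since the
    suffix would then violate |suffix| < m."""
    q, r = divmod(k, m)
    candidates = [(q, r)]                 # case (i): non-negative residue
    if r != 0:
        candidates.append((q + 1, r - m)) # case (ii): negative residue
    return candidates

cands = prefix_suffix_candidates(11, 4)
# Case (i) has an even prefix (2), case (ii) an odd one (3).
parities = [p % 2 for p, _ in cands]
```

For this key both candidates actually occur, and only the second makes the prefix odd, which is the case the attack wants because it produces an addition at that point of the trace.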


Lemma 3. For all powers $2^j < k$, there is a choice of 
base elements $m_{i'}$ and an integer $i$ such that $2^j = m^{(i)}$. 
On average, for at least $1 - 2^{-R}$ of all values of $j$, there 
are choices which make $k_p^{(i)}$ odd, and, for at least half 
of all values of $j$, there are choices which make $k_p^{(i)}$ 
even. 


Proof. The existence of the choice of basis elements is 
clear: taking $m_{i'} = 2$ for all $i'$ allows one to satisfy the 
equality for $m^{(i)}$ with $i = j$. For that choice, $k_i$ is the 
usual index $i$ bit of $k$, and takes either parity with equal 
probability. It is the lowest bit of $k_p^{(i)}$, and so $k_p^{(i)}$ is the 
desired parity with probability ½. 


Other choices of base elements exist, and they may 
result in $k_p^{(i)}$ being odd even when the bit of interest in k 
is even. This increases the average number of cases for 
which oddness occurs. Instead of choosing $m_{i-1} = 2$, we 
try choosing $m_{i-1} = 2^{i'}$ for any $i'$ with $1 < i' \le R$. The 
ability to select them depends on a corresponding bit of 
ky” ® being 1 (otherwise base 2 must be chosen). 
These bits are independent and so the alternative bases 
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can be chosen with probability 4% Each will give odd 
parity to the next k, if chosen. Hence the prefix key 
corresponding to 2’ can be made odd with probability at 


least 1-2-*. a 


Of course, this argument just gives a lower bound on
how many values of j will give rise to two key prefix and two
suffix values. It doesn't guarantee that when both
values are possible they will both appear with non-negligible
frequencies. The actual relative frequencies
appear to depend on the lowest R bits of the $k_p$
corresponding to $m^{(i)} = 2^j$. However, in the next
section, a lower bound on the ratio will be produced as
necessary for each choice of these bits.


3.4 Recovering One Digit of k


In this section we show how to recover the least
significant digit $k_0$ and associated base $m_0$ in one
representation of k and how to identify the subset of
traces which correspond to the associated prefix key $k_p$
such that $k = k_p m_0 + k_0$. Exactly the same process yields
other digits of k independently. Those digits can then
be assembled together to give k in the manner described
in the next section.


If Tr is the full set of all sample traces, then we denote
by $Tr_i$ the set of traces obtained by taking each member
of Tr and deleting the suffix to the right of, but not
including, the D of position i. Thus $Tr_0 = Tr$. $Tr_i$ is
partitioned into two complementary subsets: $Tr_i^A$ which
consists of those traces which terminate with A, and
$Tr_i^D$ which consists of those traces which terminate with
D. We need to identify one of these subsets for each
digit choice so that its neighbour to the left can be
selected correctly. $Tr_i^A$ always represents the odd
choice for $k_p^{(i)}$, but some traces in $Tr_i^D$ may contain only
some of the operations for the rightmost prefix digit,
and so not represent any $k_p^{(i)}$ properly.


The derivation here does make specific use of the fact
that in this implementation 1 is not allowed as a digit
for base 2. Similar arguments apply when 1 is allowed,
but there is a duality which leaves a complicating
ambiguity between the two values of ±k throughout the
reconstruction process. This is only resolved when the
complete value of k is reconstructed and under the
assumption that the true sign of k is known.


Lemma 4. Select any trace for key k. Then k is exactly
divisible by $2^i$ where i is the uniquely defined integer
such that $AD^i$ is a suffix of the trace.


Proof. Clearly, if k is divisible by $2^i$ then base 2 must
be chosen for the lowest i digits, which are then all
zeros. This leads to a character sequence $D^i$ of i
consecutive Ds as a suffix in every trace. If k is not
divisible by $2^{i+1}$ then, whatever the next choice of base,
the digit will be non-zero and hence cause A to be
appended to the sequence, yielding the suffix $AD^i$. □
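
Lemma 4 can be read as a simple parsing rule on a trace. The sketch below assumes a trace is given as a Python string over the characters 'A' and 'D', written in the same order as in the text; the function name and the sample trace are hypothetical.

```python
def trailing_doubles(trace):
    """Return i such that A D^i is a suffix of the trace.

    By Lemma 4, 2^i is then the exact power of 2 dividing the key
    that generated the trace.
    """
    return len(trace) - len(trace.rstrip("D"))


# Hypothetical trace ending in A followed by three Ds: the key would be
# divisible by 8 but not by 16.
example_trace = "DADDADDD"
i = trailing_doubles(example_trace)
```

Since every trace of the same key must give the same value, comparing `trailing_doubles` across the whole trace set is also a cheap consistency check on the captured data.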


This result enables these i occurrences of D to be
identified with i least significant digits 0, each of base
2. Moreover, all traces confirm this conclusion. So,
removing the digits one at a time,


Lemma 5. If every trace in Tr has final character D
then we may take $k_0 = 0$, $m_0 = 2$ and the traces of $Tr_1$ all
represent the associated $k_p$.


If k is odd, no digit has been deduced yet, and further 
work must be done. 


Lemma 6. Suppose $k \equiv 1 \bmod 2^i$ where $i \le R$. Then
$k \equiv 2^i + 1 \bmod 2^{i+1}$ if Tr contains a trace with suffix
$AD^iA$. If $k \equiv 2^i + 1 \bmod 2^{i+1}$ then the probability that Tr
contains no trace with suffix $AD^iA$ is $(1 - p_i')^{|Tr|}$ where


$p_i' = p_1 + p_2 + \ldots + p_i$.


Proof. If $k \equiv 1 \bmod 2^{i+1}$ then a base of $m_0 = 2^{i+1}$ or
larger will lead to suffix $D^{i+1}A$. However, a smaller
base $m_0 = 2^j$ will lead to suffix $D^jA$ with digit $k_0 = 1$ and
the forced selection of base 2 at least i+1−j times. This
again leads to suffix $D^{i+1}A$.

Now suppose $k \equiv 2^i + 1 \bmod 2^{i+1}$. A base of $m_0 = 2^{i+1}$ or
larger again leads to suffix $D^{i+1}A$. However, the choice
of base $m_0 = 2^i$ means lowest digit $k_0 = 1$ and next digit
determined by k div $2^i$, which is odd. Hence the suffix
is $AD^iA$ for that choice. Similarly, a base $m_0 = 2^j$ with
j < i will lead to suffix $D^jA$, digit $k_0 = 1$ and $k_p \equiv 2^{i-j}$
mod $2^{i-j+1}$. So this choice is followed by the forced
selection of base 2 exactly i−j times with associated
digit 0. The subsequent digit is then odd, resulting in
the overall suffix $AD^iA$.


Thus, suffix $AD^iA$ guarantees $k \equiv 2^i + 1 \bmod 2^{i+1}$ and it
occurs precisely when the least significant base choice
is $2^j$ with $j \le i$. These choices occur for a given trace
with probability $p_i' = p_1 + p_2 + \ldots + p_i$. Hence suffix $AD^iA$
will not happen for any trace in Tr with probability


$(1 - p_i')^{|Tr|}$. □


Lemma 7. Suppose $k \equiv 1 \bmod 2^i$. If Tr contains a
trace with suffix $AD^iA$ then $k \equiv 2^i + 1 \bmod 2^{i+1}$, we may
take $k_0 = 1$ and $m_0 = 2^i$, and the traces of $Tr_i^A$ all
represent the associated $k_p$.

This lemma deals with the recognisable instances of
$k \equiv 2^i + 1 \bmod 2^{i+1}$. When base $2^j$ is chosen for any $j \le i$,
the suffix is $AD^iA$ for these cases. As this occurs in $p_i'$
of cases, we expect $Tr_i^A$ to contain approximately
$p_i' |Tr|$ elements.


We will assume $k \equiv 1 \bmod 2^{i+1}$ if there is no suffix
$AD^iA$ but we know $k \equiv 1 \bmod 2^i$. By Lemma 6, this
introduces a small probability of error which can be
decreased by taking a larger sample if necessary, or by
further analysis, such as through a more exhaustive
analysis of suffixes and their expected frequencies than
there is space for here. Note, however, that if $p_i' = 0$
then this choice of $m_0$ will not resolve which residue
mod $2^{i+1}$ is correct. Hence an increase in security might
be obtained by having $p_1 = p_2 = \ldots = p_i = 0$ where i is as
large as possible.

Theorem 1. Assume each base $2^i$ is selected with
probability $p_i$ for odd key values, and digit 1 is only
used for bases greater than 2. Let $p_i' = p_1 + p_2 + \ldots + p_i$ and
$p_i'' = 1 - p_i'$. Suppose k is a random odd integer that
has generated trace set Tr and j ($1 \le j \le R+1$) is such
that Tr contains no trace with suffix $AD^iA$ for any i < j.
Then $k \equiv 1 \bmod 2^j$ with probability $\prod_{i=1}^{j-1} (1 + p_i''^{\,|Tr|})^{-1}$.


Proof. We prove this by induction on j. For j = 1 the
statement claims nothing, and so holds. For the
induction step, assume the statement holds for some
$j \le R$. Suppose also that Tr contains no trace with
suffix $AD^iA$ for any $i \le j$. By the induction hypothesis,
$k \equiv 1 \bmod 2^j$ with probability $\prod_{i=1}^{j-1} (1 + p_i''^{\,|Tr|})^{-1}$.
Since k is random, the two possibilities for k mod $2^{j+1}$
are equally likely. So, by Lemma 6, no occurrence of
suffix $AD^jA$ means $k \equiv 1 \bmod 2^{j+1}$ with probability
$(1 + p_j''^{\,|Tr|})^{-1}$. This factor just needs multiplying into
the product to obtain the claim for j+1 in place of j. □
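
The product in Theorem 1 is easy to evaluate for concrete parameters. The sketch below is an illustration only: the function name and argument layout are assumptions, the list `p` holds the base-selection probabilities $p_1, \ldots, p_R$ for odd keys, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction


def prob_k_is_1_mod_2j(p, j, num_traces):
    """Evaluate prod_{i=1}^{j-1} 1 / (1 + p_i''^{|Tr|}).

    p[i-1] is p_i, p_i' = p_1 + ... + p_i and p_i'' = 1 - p_i'.
    num_traces is |Tr|, the number of traces in the set being used.
    """
    prob = Fraction(1)
    cumulative = Fraction(0)
    for i in range(1, j):
        cumulative += Fraction(p[i - 1])         # p_i'
        miss = (1 - cumulative) ** num_traces    # chance that no trace shows A D^i A
        prob /= 1 + miss                         # Bayes factor from the proof
    return prob


# Uniform choice among bases 2, 4, 8 (R = 3), as in the typical case
# discussed in the paper, with 9 traces.
uniform = [Fraction(1, 3)] * 3
confidence = float(prob_k_is_1_mod_2j(uniform, 4, 9))
```

Each factor is the posterior probability from the proof: the two residues are equally likely a priori, and the missing suffix has likelihood 1 under one hypothesis and $p_i''^{\,|Tr|}$ under the other.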


Theorem 2. With assumptions and notation as in
Theorem 1, suppose k is odd and j is minimal such that
$1 \le j \le R+1$ and Tr contains a trace with suffix $AD^jA$.
Then $k \equiv 2^j + 1 \bmod 2^{j+1}$ with probability
$\prod_{i=1}^{j-1} (1 + p_i''^{\,|Tr|})^{-1}$.
If $j \le R$ we may take $k_0 = 1$ and $m_0 = 2^j$ and then the set
of traces for the associated $k_p$ is $Tr_j^A$.


Proof. Theorem 1 shows that, for the given definition
of j, $k \equiv 1 \bmod 2^j$ with probability $\prod_{i=1}^{j-1} (1 + p_i''^{\,|Tr|})^{-1}$.
If $k \equiv 1 \bmod 2^{j+1}$ then, as in the proof of Lemma 6, all
traces must terminate with suffix $D^{j+1}A$, which is not the
case. Hence $k \equiv 2^j + 1 \bmod 2^{j+1}$ with the stated
probability.


For the base $m_0 = 2^j$, $k \equiv 1 \bmod m_0$ and so the associated
digit is $k_0 = 1$. However, $k_p = k$ div $2^j$ is odd, which
forces the next digit to be non-zero. Hence A is the
next operation leftwards after the suffix $D^jA$ which
corresponds to $m_0$. Thus, the relevant traces for the
next digit are those of $Tr_j^A$. □


The values of k for which no least significant digit has
yet been assigned are those satisfying $k \equiv 1 \bmod 2^{R+1}$.
Picking maximal base $m_0 = 2^R$ gives $k_0 = 1$ and makes
$k_p$ even. The associated set of prefix traces should be
$Tr_R^D$. A possible difficulty with this definition is that
for some traces removing the suffix $D^RA$ may split
subsequences which correspond to one digit. However,
every choice of base $2^i$ corresponds to a suffix $D^iA$
where $i \le R$, and must be followed by a number of
instances of base 2 with digit 0 which makes the total
modular division by at least $2^{R+1}$. Hence the suffix $D^RA$
corresponds to the operations for a whole number of
digits. Therefore $Tr_R^D$ does indeed contain traces which
represent only operations for sequences of complete
digits, and so those traces all represent the same key
value.


Theorem 3. With the same assumptions and notation
as in Theorem 1, suppose every trace in Tr has suffix
$D^{R+1}A$. Then, with probability $\prod_{i=1}^{R} (1 + p_i''^{\,|Tr|})^{-1}$,
$k \equiv 1 \bmod 2^{R+1}$ and we may pick $m_0 = 2^R$. For this
choice $k_0 = 1$, $k_p$ is the common key for $Tr_R$, $k_p$ is even,
and $Tr_R^D = Tr_R$ has the same cardinality as Tr.


3.5 Combining Digits to Recover k


For every position j at which there is an occurrence of A
in some trace of Tr, the procedures of the previous
subsection can be applied to $Tr_j^A$ to obtain a base and digit
at that point. These digits are used when determining a
digit sequence for k. Starting at j = 0, the digits are
selected iteratively. As well as a digit and base, each
trace set $Tr_j^A$ gives rise to another trace set defined at
some position j' > j. We will show that:


• For this definition of j', the next digit is determined
by whichever is appropriate of $Tr_{j'}^A$ or $(Tr_j^A)_R^D$.


Here we need to check on the definition of the trace
subsets. If applied iteratively, the procedures above
would actually determine smaller and smaller subsets:
each time we apparently take a subset of the traces from
the previous step. However, because only two key
values (one odd, one even) are associated with any
position, every prefix which represents the operations
of a complete number of digits must correspond to the
odd key if it terminates with A and the even key if it
terminates with D. Of course, every trace prefix
terminating with A must consist of the operations for a
whole number of digits since A cannot appear in the
middle of the sequence of operations for a single digit.
So every trace in $Tr_{j'}^A$ is generated from a key value
which is common to them all. Hence, the full set $Tr_{j'}^A$
can be used to determine the next digit, not just the
subset of $Tr_{j'}^A$ determined by the procedures above.


In the case of the prefix trace set $Tr_{j+R}^D$, it is not clear
which traces are generated by a complete key. In some
cases, the final D may not be the final operation of the
digit sequence from which it was derived. Hence, the
subset $(Tr_j^A)_R^D$ must be used, not $Tr_{j+R}^D$. However, the
construction observed that every such trace had suffix
$D^{R+1}A$. So $(Tr_j^A)_R^D$ has the same cardinality as $Tr_j^A$.
Hence the trace subsets bulleted above are indeed the
correct ones to use for the key digits, and they do not
progressively decrease in size.


The process of digit determination only begins to fail
once a leading instance of A is encountered: Theorem 2
guarantees progress up to that point. Traces are not all
the same length. Some will use a large base for the
most significant digit. Their initial Ds are deleted,
giving them fewer instances of D overall, making their
traces shorter. These traces are simply discarded when
fully processed. The procedures above still apply to the
subset. Again, following Theorem 2, further digits can
still be defined until the trace set becomes empty.
However, once the first (i.e. shortest) traces run out, the
remaining key is representable by a single digit, so it is
bounded in absolute value by $2^{R-1}$. Each increment
of the position in the trace set reduces the representable
key by a factor of 2. Eventually, assuming there are
enough traces, the initial A of the longest trace has a
digit bounded by 1, and so must be 1. Hence k is
completely determined. "Enough" traces would be
present if, for example, base 2 were chosen for the most
significant digit. Insufficient traces just increase the
number of possible values of k which may need testing
by a small factor (under $2^R$).


3.6 The Probability of Error 


We have been careful to obtain the probability of error 
in each digit in order i) to see if it is feasible to recover 
the key and ii) to see how the probabilities p; might be 
adjusted in the algorithm definition to provide 
improved security. 


The procedures of §3.4 define the probabilities in terms
of the size of the trace set being employed at that time.
Generally, it is equal to the cardinality of a set of the
form $Tr_j^A$. This is equal to |Tr| times the number of DAs
in position j divided by the number of DAs or Ds in that
position. This can be approximated by |Tr| times the
overall probability $p_A$ of DA divided by the overall
probability $p_D$ of DA or D. Since the choice of base
$m = 2^i$ produces i−1 occurrences of D followed by one
of DA when i > 0, $|Tr^A| = \pi\,|Tr|$ where

$\pi = \frac{p_A}{p_D} = \frac{p_1 + p_2 + \ldots + p_R}{p_0 + p_1 + 2p_2 + \ldots + R\,p_R}$


For a uniform distribution this works out at z = - 


where typically we might expect R = 3; and for 2°-ary 


sliding windows it works out at 7 = =1-. 


In fact, the formula under-estimates the average size of
$Tr_j^A$. Some positions do not have any occurrences of A,
and we do not use the associated trace subsets. This
increases the average for those positions which do have
occurrences of A.


Next, the distribution of base choices in the
reconstructed key differs from that generated by the
re-coding process. Suppose k is odd for the set of traces at
some point during the reconstruction. In Theorem 2,
the distribution of odd residues k mod $2^{R+1}$ is uniform.
So, neglecting the assumed small numbers of
incorrectly assigned cases resulting from some of the
possible suffixes not occurring, base $2^j$ will be selected
for the reconstructed key with probability $2^{-j}$ for
$0 < j \le R$ and produce an odd next key. Further, base
$2^R$ will turn up in the remaining $2^{-R}$ cases of odd keys
but produce an even next key. In half of all cases, an
even key will lead to an even key. Consequently, out of
every $2^R + 2$ digit choices in the reconstruction, on
average $2^R$ will be odd and 2 will be even.


By Theorems 2 and 3, the probability of the
reconstructed key being correct is a product of factors
of the form $(1 + p_i''^{\,\pi|Tr|})^{-1}$. These factors can be
approximated by $(1 - p_i''^{\,\pi|Tr|})$. Since choosing base
$2^j$ leads to j such factors, there is essentially one such
factor for every bit of k. The exceptions are where an
even key causes base 2 and digit 0. Then there is no
doubt about the correctness of the digit 0. This last case
occurs for $2(2^R+2)^{-1}\log_2 k$ bits of the initial key k.
Otherwise, for odd keys, the relative frequency of
different bases means that the factor $(1 + p_i''^{\,\pi|Tr|})^{-1}$ will
appear on average for $2^{R-i}(2^R+2)^{-1}\log_2 k$ bits if
0 < i < R and for $2(2^R+2)^{-1}\log_2 k$ bits if i = R.
Because $p_R' = p_1 + p_2 + \ldots + p_R = 1$, the factor for i = R is 1,
and so can be ignored. Hence,


Lemma 8. The key k can be recovered with a
probability approximately

$\prod_{i=1}^{R-1} (1 + p_i''^{\,\pi|Tr|})^{-n 2^{-i}}$

where $n = (1 + 2^{1-R})^{-1} \log_2 k$ and π is as defined above.


The property $p_i'' \le 1 - p_1$ provides a lower bound for this
product. Consequently,


Lemma 9. For a uniform distribution of base, the key k 
can be recovered with a probability at least 
Oe 


where n= (1-2! ®)(1+2'®) logok and m= 2e. 


For specific choices, it is possible to evaluate the
product in Lemma 8 exactly. A typical choice would
be to have a key with 192 bits, R = 3 (which requires
storing two pre-computed multiples, namely P and 3P),
and a uniform choice of base, i.e. $p_1 = p_2 = p_3 = 1/3$.
Then z = % and the product in Lemma 8 is just under 
2”! for |Tr| = 9. So, if a key can be reconstructed and 
checked for correctness in unit time,


Theorem 4. If doubles and adds can be distinguished
on individual traces, and traces are captured from 9
applications of the same unblinded 192-bit key, then the
Liardet-Smart algorithm with uniform selection of base
$\le 2^3$ can be broken with a computational effort of about
02", With twice as many traces, the computational 
effort falls to under O(2°°). 


Of course, the full force of all the patterns available and 
their relative frequencies has not yet been applied. 
Hence the danger is probably substantially under- 
estimated. Once a possible key has been recovered, 
there is considerable unused data in the traces that has 


not yet been used and can be investigated for checking 


purposes. In the uniform case, about £1. of the data is 


so far unused: that in the complementary sets $Tr_j^D$.
This contains information about digits whose bases 
were not aligned with those of the reconstructed 
representation of k. Choosing a different base from that 
of the reconstruction process described above will 
provide confirmation about the correctness of each bit 
of k. Indeed, each trace has to be consistent with some 
choice of bases, and the rightmost inconsistency in a 
trace will usually be very close to the rightmost bit in
error. There is insufficient space here to improve the
probabilities which are a consequence of this approach,
but the computational feasibility of the attack is already
assured.

If the attacker is unable to distinguish clearly between
adds and doubles, then the unused data vastly increases 
his ability to make corrections. Moreover, as each digit 
is obtained through a purely local extraction of data 
from traces, it is easy to automate an exhaustive process 
to check for the overall best digit solutions using all 
traces, and hence prioritise the order for considering the 
most likely values for k. However, for the data that has 
been used, any indistinctness between A and D is 
unimportant. In this attack, it is only necessary to 
establish whether or not an A has appeared at each 
position. The relative frequency of As means that the 
certainty of this can be determined with high degree 
just by increasing the number of traces sufficiently. 


4 Counter-Measures 


Our formulae for bounding the accuracy repeatedly 
used the probabilities of smaller bases much more than 
larger bases, and the accuracy improves when these 
probabilities are increased at the expense of the 
probabilities of larger bases. This is consistent with the 
greater ambiguity afforded by digits of larger bases. 
Thus we recommend not using a uniform choice for the
base, but employing a strong bias towards large bases, 
such as was illustrated in §2.2. In the extreme, 
the standard, non-randomised, m-ary exponentiation 
technique is obtained, and this is not susceptible to the 
attack. 


The cost of key masking is not entirely trivial in the 
context of ECC. Adding a 32-bit random multiple of 
the group order to the key increases the point 
multiplication cost by some 17% for 192-bit keys, 
although it is a much smaller fraction of the total 
encryption cost. Adding a smaller random multiple is 
probably ineffective if it results in a number of 
repetitions of the same key value within the lifetime of 
the key. The highly repetitive nature of the traces 
resulting from the same prefix keys turning up again 
and again means that a duplicated key could be 
assumed if, and only if, traces matched closely enough. 


The “double-and-add-always” method of computation 
provides a good measure of protection, but is 
expensive. The attacker then has to determine whether 
or not the result of the addition is used before he can 
mount the attack. This is much more difficult than 
distinguishing the two operations. Hence traces will be 
susceptible to much more frequent errors, and a much 
greater number of traces will have to be recovered. 


There are alternative randomised algorithms for which
this type of attack does not apply, and others that
display similar weaknesses. That of Oswald and
Aigner [9] can be attacked in a similar way. MIST
[15, 17] does not exhibit the same repetition of key
values during key processing, and so may be a safer 
choice. A new algorithm by Itoh et al. [18] may also be 
worthy of consideration. 


5 Conclusion 


It might have been hoped that the Liardet-Smart 
algorithm would avoid the cost of any additional 
counter-measures such as key blinding when the same 
secret key is repeatedly re-used, but this now appears 
not to be so. Specifically, the key needs to be masked, 
or the pattern of adds and doubles has to be well hidden 
for individual point multiplications. 


Of course, there are many circumstances in which the 
algorithm is clearly of value, such as ECDSA, for 
which a different random key is used every time. Then, 
for suitable parameter choices, the space of keys 
generating a given pattern of adds and doubles is
infeasibly large, and so cannot be attacked successfully 
without additional data.


Abstract


Side-channel analysis is a powerful tool for retriev- 
ing secrets embedded in cryptographic devices such 
as smart cards. Although several practical solutions 
have been proposed to prevent the leakage of sensi- 
tive data, mainly the protection of the basic crypto- 
graphic operation itself has been thoroughly investi- 
gated. For example, for exponentiation-based cryp- 
tosystems (including RSA, DH or DSA), various 
exponentiation algorithms protected against side-channel
analysis are known. However, the exponentiation
algorithm itself or the underlying crypto-algorithm
often involves division operations (for computing
a quotient or a remainder). The first case appears
in the normalization (resp. denormalization)
process in fast exponentiation algorithms and the 
second case appears in the data processing before 
(resp. after) the call to the exponentiation opera- 
tion. 


This paper proposes an efficient division algorithm 
protected against simple side-channel analysis. The 
proposed algorithm applies equally well to software 
and hardware implementations. Furthermore, it 
does not impact the running time nor the memory 
requirements. 


Keywords. Division algorithms, smart cards, side- 
channel analysis, SPA protected implementations. 


1 Introduction 


Significant progress has been made in recent years
to secure cryptographic devices (e.g., smart cards) 
against side-channel analysis. Side-channel anal- 
ysis [2, 3] is a clever technique exploiting side- 
channel information (e.g., power consumption) to 
retrieve secret information involved in the execution 


of a carelessly implemented crypto-algorithm. The 
threat is now clearly understood by implementors 
and various countermeasures have been suggested. 


The basic operation underlying most public-key 
crypto-algorithms is the modular exponentiation. 
To name a few, this includes the RSA cryptosys- 
tem, the Diffie-Hellman key exchange or the DSA 
signature scheme. The resistance of modular expo- 
nentiation with respect to side-channel analysis is 
discussed in many papers (e.g., see [4] where both 
attacks and counter-measures are presented). A 
far less studied operation is that of division: to 
the authors’ best knowledge, there is no paper in 
the public literature addressing this issue. This 
is most unfortunate as nearly all implementations 
of exponentiation-based cryptosystems use the divi- 
sion operation as well. 


Several specialized modular multiplication algorithms
(and therefore the corresponding modular
exponentiation algorithms) require a normalization
step involving an integer division. Typical examples
include Barrett's algorithm [5] or Quisquater's
algorithm [6] (see also [7]). For computing a·b mod m,
these two algorithms take on input a normalization
factor of the form $\mu = \lfloor 2^{\ell}/m \rfloor$. If the division
algorithm used for evaluating μ is prone to side-channel
analysis then the value of m (or some related
information) can be recovered. When m is secret,
this compromises the security of the cryptosystem.
For example, this occurs when RSA decryption (or
signature) is sped up through Chinese remaindering [8]
because then modulus m is successively
one of the two secret RSA primes, $p_1$ and $p_2$. A
second example of a division algorithm manipulating
secret data is when RSA is used with Chinese
remaindering and operand x in the computation of
$x^d \bmod \{p_1, p_2\}$ is first explicitly reduced modulo $p_i$
prior to the exponentiation $x^d \bmod p_i$, for i = 1, 2.


An algorithm commonly used for computing integer
divisions is the classical binary pencil-and-paper
method (or a variant thereof). This algorithm
has the advantage of requiring no extra
memory. However, as we will see,
it may yield the value of quotient q = a div b during
its computation by simple side-channel analysis.
This paper is aimed at transforming this algorithm
into a division algorithm protected against simple
side-channel analysis while preserving the efficiency
(memory-wise) of the classical algorithm. Actually,
the resulting algorithm will not only be protected
against simple side-channel analysis but will further
be faster than the classical algorithm, with the
same memory requirements. As a result, we obtain
a protected division algorithm that is frugal in
memory and is particularly suited to a hardware
implementation or to a software implementation in a
constrained environment like a smart card.


The rest of this paper is organized as follows. The 
next section reviews the classical binary pencil-and- 
paper division algorithm. Its security towards sim- 
ple side-channel analysis is studied in Section 3. 
Building on the pencil-and-paper method, we then 
propose in Section 4 our protected yet more effi- 
cient division algorithm. Finally, we conclude in 
Section 5. 


Disclaimer. This paper only addresses security 
against simple side-channel analysis, that is, side- 
channel analysis from a single measurement of cer- 
tain side-channel information. In particular, it is 
not concerned with differential analysis (such as 
DPA) or more sophisticated methods. 


2 Pencil-and-Paper Division Method 


Given a and b on input, the binary pencil-and-paper
algorithm evaluates the quotient q = a div b
(alternatively, we use the notation $q = \lfloor a/b \rfloor$) and
the remainder r = a mod b. The binary representations
of a and b are respectively given by
$a = (a_{m-1}, \ldots, a_0)_2$ and $b = (b_{n-1}, \ldots, b_0)_2$ with
$b_{n-1} \ne 0$.


It is easy to see that the pencil-and-paper division
of two integers amounts to the simpler problem of
dividing an (n+1)-bit integer A by an n-bit integer b
and then to re-iterate the process [1]. We must have
$0 \le A/b < 2$, which is satisfied whenever $b_{n-1} \ne 0$
(see above restriction).

Since we are working in base 2, the two possible
values for the quotient bit $\lfloor A/b \rfloor$ are 0 or 1. So we
subtract b from A and test whether the obtained
result is nonnegative; if so then $\lfloor A/b \rfloor = 1$, and
$\lfloor A/b \rfloor = 0$ otherwise. Remark that $\lfloor A/b \rfloor = 1$ if
and only if $A - b \ge 0$.


The length of r = a mod b plus the length of
q = a div b is smaller than or equal to (m + 1) bits.
Indeed, the length of r is at most n bits since r < b;
and the length of q is at most (m − n + 1) bits
since $q = \lfloor a/b \rfloor \le \lfloor a/2^{n-1} \rfloor = a \,\mathrm{div}\, 2^{n-1} =
(a_{m-1}, \ldots, a_{n-1})_2$, a (m − n + 1)-bit value. In order
to save memory, the quotient and remainder will
be written in the register containing a (augmented
with one leading bit).


Before:  | 0 | a (m bits)                        |

After:   | r (n bits) | q (m − n + 1 bits)       |


Figure 1: Memory configuration.


It is useful to introduce some notation. For a k-bit
integer a, we denote by $\mathrm{SHL}_k(a, 1)$ the operation
consisting in shifting a by one bit to the left; the
outgoing bit is assigned to the carry. For two k-bit
integers a and b, we write $\mathrm{ADD}_k(a, b)$ for the addition
of a and b; variable carry is set to 1 if there is a carry
in the addition and carry is set to 0 otherwise.
Remark that $\mathrm{SHL}_k(a, 1)$ can equivalently be obtained
as $\mathrm{ADD}_k(a, a)$. There is usually no subtraction
operation available for large integers. The subtraction
of b from a is obtained by first computing the two's
complement of b, denoted by $\bar{b}$, and then by adding
$\bar{b}$ to a. Indeed, if b is a k-bit integer then $b + \bar{b} = 2^k$
and so $a - b \equiv a + \bar{b} \pmod{2^k}$. We write $\mathrm{CPL2}_k(a)$
for the operation consisting in taking the two's
complement of a k-bit integer a. Symbols ∨, ∧ and ⊕ refer
to the bit-wise logical operations OR, AND and XOR,
respectively. For a bit σ, the negation of σ (i.e., its
complementary value) is denoted by ¬σ. Finally,
the notation lsb(a) refers to the least significant bit
of an integer a.
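
The notation above can be mirrored by a few Python helpers. This is only a sketch: the function names are chosen here to echo CPL2, ADD and SHL, and Python's unbounded integers stand in for fixed-width registers. It illustrates why the carry of the addition $a + \bar{b}$ tells whether $a \ge b$.

```python
def cpl2(x, k):
    """Two's complement of a k-bit integer x, i.e. (2^k - x) mod 2^k."""
    return (-x) % (1 << k)


def add_k(x, y, k):
    """k-bit addition: returns (sum mod 2^k, carry)."""
    total = x + y
    return total % (1 << k), total >> k


def shl_k(x, k):
    """Left shift by one bit, obtained as ADD_k(x, x); the outgoing bit is the carry."""
    return add_k(x, x, k)


# Subtracting b = 81 from 7-bit values by adding its two's complement.
b_bar = cpl2(81, 7)                  # 47 = (0101111)_2
diff, carry = add_k(94, b_bar, 7)    # 94 >= 81: carry is 1 and diff is 94 - 81
```

For nonzero b, the carry is 1 exactly when the true difference is nonnegative, which is the test the division algorithm relies on.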


We can now present the classical binary pencil-and-paper
division algorithm. On input a and b, this
algorithm computes both the values of a div b and
of a mod b. In order to work, a is artificially
augmented with one bit (initially set to 0) at the most
significant position. Moreover, to ease the exposition,
variable A represents the n most significant
bits of a, i.e., $A = (0, a_{m-1}, \ldots, a_{m-n+1})_2$.


Input:  a = (0, a_{m−1}, ..., a_0)_2
        b = (b_{n−1}, ..., b_0)_2
Output: q = a div b and r = a mod b

    b ← CPL2_n(b)                 /* b = "−b" */
    for j = 1 to (m − n + 1) do
        a ← SHL_{m+1}(a, 1)       /* Shift */
        σ ← carry
        A ← ADD_n(A, b)           /* Subtract */
        σ ← σ ∨ carry
        if (¬σ) then              /* Correction */
            b ← CPL2_n(b)
            A ← ADD_n(A, b)
            b ← CPL2_n(b)
        else lsb(a) = 1
    endfor
    b ← CPL2_n(b)


Figure 2: Binary pencil-and-paper division algorithm.


Remainder r is in A followed by the quotient q. The
correctness of the algorithm follows by observing
that if $a \leftarrow \mathrm{SHL}_{m+1}(a, 1)$ generates a carry then
$a_m = 1$ (before shifting) and so b must be subtracted;
moreover if $a_m = 0$ (before shifting) and
$A \leftarrow \mathrm{ADD}_n(A, \bar{b})$ generates a carry (i.e. $A - b \ge 0$
before the subtraction) then again b must be subtracted.


Example 1 Suppose we want to compute a div b
and/or a mod b with $a = 4096 = (1000000000000)_2$
and $b = 81 = (1010001)_2$. The 2's complement of b
is $\bar{b} = (0101111)_2$. We obtain $r = 46 = (101110)_2$,
and $q = 50 = (110010)_2$.
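
For readers who want to experiment, here is a minimal Python sketch of the algorithm of Fig. 2. It is an illustration under stated assumptions rather than the authors' implementation: Python integers model the (m+1)-bit register, the function name and the early return for a shorter dividend are choices made for this example, and no attempt is made at constant-time behaviour.

```python
def pencil_and_paper_divide(a, b):
    """Return (q, r) = (a div b, a mod b) following the structure of Fig. 2."""
    if a < 0 or b <= 0:
        raise ValueError("expects a >= 0 and b > 0")
    m, n = a.bit_length(), b.bit_length()
    if m < n:                        # dividend shorter than divisor: q = 0
        return 0, a
    width = m + 1                    # register a, augmented with one leading 0 bit
    low_bits = width - n             # bits of the register below the window A
    low_mask = (1 << low_bits) - 1
    n_mask = (1 << n) - 1
    neg_b = (-b) & n_mask            # CPL2_n(b): n-bit two's complement of b
    reg = a
    for _ in range(m - n + 1):
        # Shift: a <- SHL_{m+1}(a, 1); the outgoing bit goes to the carry.
        reg <<= 1
        sigma = reg >> width
        reg &= (1 << width) - 1
        # Subtract: A <- ADD_n(A, -b), where A is the top n bits of the register.
        total = (reg >> low_bits) + neg_b
        sigma |= total >> n
        window = total & n_mask
        if not sigma:
            # Correction: add b back, the quotient bit stays 0.
            window = (window + b) & n_mask
        else:
            reg |= 1                 # quotient bit 1: lsb(a) = 1
        reg = (window << low_bits) | (reg & low_mask)
    return reg & low_mask, reg >> low_bits


q, r = pencil_and_paper_divide(4096, 81)   # the inputs of Example 1
```

With the inputs of Example 1 this yields q = 50 and r = 46, and the result can always be cross-checked against Python's built-in `divmod`.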


3 Security Analysis 


This section explains why security may be a concern 
when implementing a division algorithm. 


Back to the binary pencil-and-paper division
algorithm (Fig. 2), we see that the quotient is
constructed bit by bit. According to its value, each
quotient bit is obtained by a different sequence of
operations. At step j in the for-loop, if the
obtained quotient bit, say $q_j$, has value 0 then the
following operations were performed:


• a shift [SHL_{m+1}(a, 1)]
• an addition [ADD_n(A, b)]
• a "correction" [CPL2_n(b); ADD_n(A, b); CPL2_n(b)]


along with some logical operations; whereas if $q_j = 1$
then the following two operations were performed:


• a shift [SHL_{m+1}(a, 1)]
• an addition [ADD_n(A, b)]


along with some logical operations. If it is possible
to distinguish these two sequences of operations
during the course of the algorithm then the value
of quotient bit $q_j$ (and thus of the whole quotient
q) can be recovered. Such a means may be provided
by monitoring the power consumption (i.e.,
the side-channel is the power consumption). The
next figure represents a power trace resulting from
the execution of the pencil-and-paper division
algorithm on a chip equipped with a crypto-coprocessor.








The operands are large numbers and the various operations (shift, addition or two's complement) are performed by the crypto-coprocessor.


[Figure 3: A power trace of the pencil-and-paper division algorithm. Image not reproduced.]


We can identify two different patterns in Fig. 3: one corresponds to the case $q_j = 0$ (i.e., a longer pattern involving an additional "correction") and the other one corresponds to the case $q_j = 1$. Therefore, the value of quotient $q = a \,\mathrm{div}\, b$ can easily be read from the power trace.


Sequence of operations for Example 1 (register contents after each operation):

Operation     a (or A)          σ
Shift         10000000000000    0
Subtract      1101111           0
Correction    1000000
Shift         00000000000000    1
Subtract      0101111           1
Shift         10111100000010    0
Subtract      0001101           1
Shift         00110100000110    0
Subtract      1001001           0
Correction    0011010
Shift         01101000001100    0
Subtract      1100011           0
Correction    0110100
Shift         11010000011000    0
Subtract      0010111           1
Shift         01011100110010    0
Subtract      1011101           0
Correction    01011100110010
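The leak described above can be illustrated with a small simulation. The following is only a sketch under stated assumptions: the class and method names are hypothetical, the restoring loop is written from the operation lists given in this section (shift, addition of the two's complement, optional correction), and the "trace" is simply a string per iteration standing in for the power pattern. It assumes $0 \leq a < 2^m$ and $2^{n-1} \leq b < 2^n$.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of restoring binary division that records the operations of each iteration. */
final class RestoringDivisionTrace {

    /**
     * Divides a by b, assuming 0 <= a < 2^m and 2^(n-1) <= b < 2^n.
     * One entry is appended to ops per loop iteration; returns {quotient, remainder}.
     */
    static long[] divide(long a, long b, int m, int n, List<String> ops) {
        final int low = m + 1 - n;                  // width of the part of a below A
        final long maskN = (1L << n) - 1;
        final long maskLow = (1L << low) - 1;
        final long bBar = (-b) & maskN;             // two's complement of b on n bits
        for (int j = 1; j <= m - n + 1; j++) {
            boolean sigma = ((a >> m) & 1) == 1;    // carry out of the shift
            a = (a << 1) & ((1L << (m + 1)) - 1);
            long sum = (a >> low) + bBar;           // A + two's complement of b
            sigma |= (sum >> n) == 1;               // carry out of the addition
            long top = sum & maskN;
            String trace = "SHL ADD";
            if (!sigma) {                           // unsuccessful guess: restore A
                top = (top + b) & maskN;
                trace += " CPL2 ADD CPL2";
            }
            a = (top << low) | (a & maskLow) | (sigma ? 1L : 0L);
            ops.add(trace);
        }
        return new long[] { a & maskLow, a >> low };
    }

    /** What an observer learns: a short iteration means the quotient bit is 1. */
    static long quotientFromTrace(List<String> ops) {
        long q = 0;
        for (String op : ops) {
            q = (q << 1) | (op.equals("SHL ADD") ? 1L : 0L);
        }
        return q;
    }

    public static void main(String[] args) {
        List<String> ops = new ArrayList<>();
        long[] qr = divide(4096, 81, 13, 7, ops);
        System.out.println("q = " + qr[0] + ", r = " + qr[1]);
        System.out.println("q read from the operation sequence: " + quotientFromTrace(ops));
    }
}
```

The observer never looks at register contents; the length of each iteration's operation sequence is enough to rebuild the quotient.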


If quotient $q$ (or related data) is secret, the above implementation is not secure. Consider the example of RSA implemented with Chinese remaindering. Let $N = p_1 p_2$ be an RSA modulus (the values of $p_1$ and $p_2$ are secret). The computation of $y = x^d \bmod N$ is carried out as $y = \mathrm{CRT}(y_1, y_2)$ where $y_i = x_i^{d} \bmod p_i$ with $x_i = x \bmod p_i$, for $i = 1, 2$. Suppose that $x_1$ and $x_2$ are computed by the binary pencil-and-paper algorithm as given in Fig. 2. Then, by simple power analysis¹ (SPA), the values of $q_1 := \lfloor x/p_1 \rfloor$ and $q_2 := \lfloor x/p_2 \rfloor$ can be recovered from the corresponding power traces. Suppose further that $x = N - r$ for some $0 < r < p_1$. Then

$$q_1 = \left\lfloor \frac{x}{p_1} \right\rfloor = \left\lfloor p_2 - \frac{r}{p_1} \right\rfloor = p_2 - 1$$

and so the secret RSA primes are given by $p_2 = q_1 + 1$ and $p_1 = N/p_2$.
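A toy numerical instance may make the factor recovery concrete. This is a sketch with deliberately tiny, hypothetical primes (61 and 53); the class and method names are illustrative only, and the "leaked" quotient is computed directly instead of being read from a power trace.

```java
/** Toy illustration: recovering the RSA primes from a leaked quotient floor(x / p1). */
final class CrtQuotientLeak {

    /**
     * Given the public modulus and the quotient q1 = floor(x / p1) observed for
     * x = modulus - r with 0 < r < p1, returns {p1, p2}.
     */
    static long[] recoverPrimes(long modulus, long leakedQ1) {
        long p2 = leakedQ1 + 1;               // because q1 = p2 - 1
        return new long[] { modulus / p2, p2 };
    }

    public static void main(String[] args) {
        long p1 = 61, p2 = 53;                // secret primes (toy sizes)
        long modulus = p1 * p2;
        long r = 5;                           // any value with 0 < r < p1
        long x = modulus - r;                 // input chosen by the attacker
        long leakedQ1 = x / p1;               // what the division trace reveals
        long[] primes = recoverPrimes(modulus, leakedQ1);
        System.out.println("p1 = " + primes[0] + ", p2 = " + primes[1]);
    }
}
```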





4 A Protected Method


The previous analysis illustrates that non-constant 
code may reveal sensitive data, thereby compro- 
mising the security of the cryptosystem. A first 
idea to make the code constant consists in adding 
some dummy operations and in making implicit the 
if-then-else statement. Such a solution is how- 
ever unsatisfactory as dummy operations penalize 
the running time. This is especially true when the 
dummy operations are time-consuming.


Rather, we exploit the following observation. At each iteration, the pencil-and-paper algorithm (Fig. 2) computes $a \leftarrow 2a - b\,2^{m-n+1}$. In the case of an unsuccessful guess (i.e., when $\sigma = 0$), one has to restore the value of $a$ by setting $A \leftarrow A + b$. This restoring step can be avoided by noting that $2(a + b\,2^{m-n+1}) - b\,2^{m-n+1} = 2a + b\,2^{m-n+1}$. We then obtain the non-restoring variant of the classical binary pencil-and-paper division algorithm. An additional variable, $\sigma'$, keeps track of the value of bit $\sigma$ in the previous iteration. Bit $\beta$ keeps track of the 'sign' of $b$.


¹ That is, a side-channel analysis using a single power consumption measurement as side-channel information.


Input:  a = (a_{m-1}, ..., a_0)_2
        b = (b_{n-1}, ..., b_0)_2
Output: q = a div b and r = a mod b

σ' ← 1; β ← 1
for j = 1 to (m−n+1) do
    a ← SHL_{m+1}(a, 1); σ ← carry
    if (σ')
        then if (β) then b ← CPL2_n(b); β ← 0
             A ← ADD_n(A, b); σ ← σ ∨ carry
        else if (¬β) then b ← CPL2_n(b); β ← 1
             A ← ADD_n(A, b); σ ← σ ∧ carry
    if (σ) then lsb(a) = 1
    σ' ← σ
endfor
if (¬β) then b ← CPL2_n(b)
if (¬σ) then A ← ADD_n(A, b)


Figure 4: Non-restoring binary division algorithm.³


The previous algorithm does not behave regularly and so may also be subject to side-channel analysis. According to the values of $\beta$ and of $\sigma'$, register $b$ is unchanged or replaced by its two's complement. A first step towards side-channel protection consists in always performing a two's complement followed by an addition, whatever the values of $\beta$ and $\sigma'$. To this end, when register $b$ need not be replaced by its two's complement, a dummy two's complement (on a register, say register $c$, that does not impact the computation) is executed. We call $d_{addr}$ the address of the register containing the value that will be replaced by its two's complement ($d_{addr}$ will be $b_{addr}$ or $c_{addr}$).


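It is also worth noting that $\sigma$ can be updated as $\sigma \leftarrow \sigma'(\sigma \vee \mathit{carry}) + (\neg\sigma')(\sigma \wedge \mathit{carry})$, which can equivalently be rewritten as


$$\sigma \leftarrow (\sigma \wedge \sigma') \oplus (\sigma \wedge \mathit{carry}) \oplus (\sigma' \wedge \mathit{carry}).$$


Finally, noting that the shift operation sets the least significant bit of $a$ to 0, the line [if ($\sigma$) then lsb($a$) = 1] can be replaced by lsb($a$) $\leftarrow \sigma$.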
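The equivalence of the two update rules for $\sigma$ can be checked exhaustively, since only eight input combinations exist. The sketch below uses hypothetical method names; `prev` stands for $\sigma'$.

```java
/** Exhaustive check that the branching update of sigma equals the branch-free XOR form. */
final class SigmaUpdate {

    /** sigma'(sigma OR carry) + (NOT sigma')(sigma AND carry), written with a branch. */
    static boolean viaBranches(boolean sigma, boolean prev, boolean carry) {
        return prev ? (sigma || carry) : (sigma && carry);
    }

    /** (sigma AND sigma') XOR (sigma AND carry) XOR (sigma' AND carry). */
    static boolean viaXor(boolean sigma, boolean prev, boolean carry) {
        return (sigma & prev) ^ (sigma & carry) ^ (prev & carry);
    }

    /** Returns true if both forms agree on all eight inputs. */
    static boolean agreeEverywhere() {
        for (int bits = 0; bits < 8; bits++) {
            boolean sigma = (bits & 1) != 0;
            boolean prev = (bits & 2) != 0;
            boolean carry = (bits & 4) != 0;
            if (viaBranches(sigma, prev, carry) != viaXor(sigma, prev, carry)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("forms agree: " + agreeEverywhere());
    }
}
```

The XOR form is the three-input majority function, which is why it needs no conditional on $\sigma'$.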


We use $\beta$ and $\gamma$ variables to keep track of the 'sign' of the value contained in the registers located at $b_{addr}$ and $c_{addr}$, respectively. The convention is $\beta = 1$ (resp. $\gamma = 1$) when the value located at $b_{addr}$ (resp. $c_{addr}$) is the original value, and $\beta = 0$ (resp. $\gamma = 0$) when the value located at $b_{addr}$ (resp. $c_{addr}$) is the two's complement of the original value. We have the following truth table:


   incoming        d_addr     outgoing
 σ'   β   γ                   β   γ
 0    0   0        b_addr     1   0
 0    0   1        b_addr     1   1
 0    1   0        c_addr     1   1
 0    1   1        c_addr     1   0
 1    0   0        c_addr     0   1
 1    0   1        c_addr     0   0
 1    1   0        b_addr     0   0
 1    1   1        b_addr     0   1


wherefrom we derive the outgoing values $\beta \leftarrow \neg\sigma'$ and $\gamma \leftarrow \gamma \oplus \sigma' \oplus \beta$.


³ Again variable A represents the n most significant bits of variable a, i.e., A = (0, a_{m-1}, ..., a_{m-n+1}).


CARDIS '02: 5th Smart Card Research & Advanced Application Conference
USENIX Association


Putting all together, we finally obtain the following 
algorithm. 


Input:  a = (a_{m-1}, ..., a_0)_2
        b = (b_{n-1}, ..., b_0)_2
Output: q = a div b and r = a mod b

σ' ← 1; β ← 1; γ ← 1
for j = 1 to (m−n+1) do
    a ← SHL_{m+1}(a, 1)                     /* Shift */
    σ ← carry; δ ← σ' ⊕ β
    d_addr ← b_addr + δ(c_addr − b_addr)
    d ← CPL2_n(d)                           /* Two's complement */
    A ← ADD_n(A, b)                         /* Addition */
    σ ← (σ ∧ σ') ⊕ (σ ∧ carry) ⊕ (σ' ∧ carry)
    β ← ¬σ'; γ ← γ ⊕ δ; σ' ← σ
    lsb(a) ← σ
endfor
/* Final correction */
if (¬β) then b ← CPL2_n(b)
if (¬γ) then c ← CPL2_n(c)
if (¬σ) then A ← ADD_n(A, b)


Figure 5: Our protected division algorithm.⁴


One may argue that our algorithm is not code-constant because of the last three if-then statements. We note however that the first two are not mandatory to make the algorithm work but are merely performed to reset the registers containing $b$ and $c$ to their original values. Finally, the last if-then only reveals the least significant bit of the quotient; when this is a secret value a dummy operation can be applied to mask the potential addition $\mathrm{ADD}_n(A, b)$.

⁴ As in Figs. 2 and 4, variable A represents the n most significant bits of a (i.e., A = (0, a_{m-1}, ..., a_{m-n+1})).
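The protected algorithm can be exercised in software to check its arithmetic. The following is a sketch rather than a hardened implementation: the names are hypothetical, the two registers $b$ and $c$ are modelled as a two-element array so that $\delta$ selects the register to complement, and Java code on a general-purpose CPU makes no claim about constant power consumption. It assumes $0 \leq a < 2^m$ and $2^{n-1} \leq b < 2^n$.

```java
/** Software model of the protected (non-restoring, branch-free loop) division. */
final class ProtectedDivision {

    /** Returns {quotient, remainder}; assumes 0 <= a < 2^m and 2^(n-1) <= b < 2^n. */
    static long[] divide(long a, long b, int m, int n) {
        final int low = m + 1 - n;                   // width of the part of a below A
        final long maskN = (1L << n) - 1;
        final long maskA = (1L << (m + 1)) - 1;
        final long maskLow = (1L << low) - 1;
        long[] reg = { b, 0L };                      // reg[0] is register b, reg[1] is dummy register c
        int sigmaPrev = 1, beta = 1, gamma = 1;
        for (int j = 1; j <= m - n + 1; j++) {
            int sigma = (int) ((a >> m) & 1);        // carry out of the shift
            a = (a << 1) & maskA;                    // shift
            int delta = sigmaPrev ^ beta;            // 0 selects b, 1 selects c
            reg[delta] = (-reg[delta]) & maskN;      // two's complement (real or dummy)
            long sum = (a >> low) + reg[0];          // addition on A
            int carry = (int) ((sum >> n) & 1);
            a = ((sum & maskN) << low) | (a & maskLow);
            sigma = (sigma & sigmaPrev) ^ (sigma & carry) ^ (sigmaPrev & carry);
            beta = sigmaPrev ^ 1;
            gamma ^= delta;
            sigmaPrev = sigma;
            a |= sigma;                              // lsb(a) <- sigma
        }
        // Final correction: restore b and c, then fix a negative remainder.
        if (beta == 0) reg[0] = (-reg[0]) & maskN;
        if (gamma == 0) reg[1] = (-reg[1]) & maskN;
        if (sigmaPrev == 0) {
            long top = ((a >> low) + reg[0]) & maskN;
            a = (top << low) | (a & maskLow);
        }
        return new long[] { a & maskLow, a >> low };
    }

    public static void main(String[] args) {
        long[] qr = divide(4096, 81, 13, 7);
        System.out.println("q = " + qr[0] + ", r = " + qr[1]);
    }
}
```

Every loop iteration performs exactly one shift, one two's complement and one addition; only the target of the complement depends on the data.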


Example 2 We take the same example as before: $a = 4096 = (1000000000000)_2$ and $b = 81 = (1010001)_2$ ($\bar{b} = (0101111)_2$). As detailed below, we obtain $r = 46 = (101110)_2$ and $q = 50 = (110010)_2$.


Operation       a (or A)          σ  β
                01000000000000    ×  1
Shift           10000000000000    0  1
CPL2_n(b)
ADD_n(A,b)      1101111           0  0
Shift           10111100000000    1  0
CPL2_n(b)
ADD_n(A,b)      0101111           1  1
Shift           10111100000010    0  1
CPL2_n(b)
ADD_n(A,b)      0001101           1  0
Shift           00110100000110    0  0
CPL2_n(c)
ADD_n(A,b)      1001001           0  0
Shift           00100100001100    1  0
CPL2_n(b)
ADD_n(A,b)      1100011           0  1
Shift           10001100011000    1  1
CPL2_n(c)
ADD_n(A,b)      0010111           1  1
Shift           01011100110010    0  1
CPL2_n(b)
ADD_n(A,b)      1011101           0  0

CPL2_n(b)       (final correction on b)
Final corr.     01011100110010    0  0

The corresponding power trace is given in Fig. 6.


Remark that the power trace is now the repetition of the same pattern, regardless of the value of the quotient bit.


[Figure 6: A power trace of our division algorithm. Image not reproduced.]


Finally, as a side-effect, it is easy to see that, on 
average, our protected algorithm outperforms the 
classical pencil-and-paper method, with the same 
memory constraints (cf. Fig. 1). 


Table 1: Comparison with the classical method.


              ADD_n            SHL_{m+1}    CPL2_n
Classical     (3/2)(m−n+1)     m−n+1        m−n+3
Protected     m−n+3/2          m−n+1        m−n+2


5 Conclusions


This paper presented a new division algorithm pre- 
venting simple side-channel analysis. The proposed 
algorithm is well suited to a hardware implemen- 
tation or to a software implementation in a con- 
strained environment. Remarkably, it does not re- 
quire additional resources (time or memory) and is 
even faster than the classical binary method. 


Obviously, we note that SPA-like analysis highly de- 
pends on the hardware and special care must be 
paid by the implementor. In this respect, the pro- 
posed method can be seen as a useful framework 
for designing protected and, as shown in the paper, 
efficient division algorithms.


Abstract


Transacted Memory offers persistence, un- 
doability and auditing. We present a 
Java/JML Reference Model of the Transacted 
Memory system on the basis of our earlier 
separate Z model and C implementation. We 
conclude that Java/JML combines the advan- 
tages of a high level specification in the JML 
part (based on our Z model), with a detailed 
implementation in the Java part (based on 
our C implementation). 


1 Introduction 


In a previous paper [6] we introduced Transacted Memory as an efficient means to implement atomic updates of arbitrarily sized information on smart cards. Smart cards need such a facility, as a transaction can be aborted by a card tear, i.e. by pulling the smart card out of the Card Acceptance Device (CAD), at any moment. A patent application has been filed for this Transacted Memory [5]. Its design allows a much smaller implementation overhead than the transaction mechanism in the current Java Card API¹, which does not even provide generational, logging, or multiple concurrent transactions.


1 Java and all Java-based trademarks and logos are 
trademarks or registered trademarks of Sun Microsys- 
tems, Inc. in the U.S. or other countries, and are used 
under license. 


In our earlier paper we provided a succinct 
abstract Z specification [13] of the system, a 
first Z refinement that takes into account the 
peculiarities of EEPROM memory (i.e. byte 
read versus block write), a second Z refine- 
ment that deals with card tear, and, finally, 
an (inefficient) C implementation. (The inef- 
ficiency is due to the use of many simple for- 
loops that search the memory; we are working 
on a VHDL specification of a hardware mod- 
ule that will replace the for-loops by efficient 
parallel searches but this is beyond the scope 
of the present paper.) The C implementation 
has been coded in such a way that it also 
serves as a SPIN [8] model. 


From our earlier work we concluded that a 
formal connection between specification and 
implementation would have been highly de- 
sirable, yet such a connection cannot be ob- 
tained using Z and C. While a formal con- 
nection can be established using SPIN, we 
believe the readability leaves much to be de- 
sired, as specification and implementation 
tend to be intertwined in a SPIN model. 


In the present paper we adopt an integrated approach to specification and implementation that solves the problems of readability and the lack of a formal connection between specification and implementation. We use the Java/JML [9] modelling method and tools, which means we write formal specifications by annotating the Java code with invariants, preconditions, and postconditions, using the specification language JML (see www.jmlspecs.org). These formal specifications can then be compiled into runtime checks [4], providing a convenient way of checking specifications against code. The Java/JML modelling method and the runtime assertion checker ensure a strong, formal connection between Java implementation and JML specification.


In the present work we apply Java/JML to what we hope will become a component of a future version of the Java Card technology. JML has already been used to specify the entire Java Card API [11, 12], and other tools than the runtime assertion checker have already been used to verify JML specifications of Java Card applets [3, 1].


The contributions of the present paper are: 


• Several bugs have been detected and repaired in the implementation of the Transacted Memory.


• We make the pre- and postconditions of the memory operations explicit in the JML specifications. The readability of these specifications is better because the reader does not have to trawl through the entire Z specification to discover the pre- and postconditions. The connection between specification and implementation is formal, and has been checked using the runtime assertion checker.


• The previous C implementation cum SPIN model relied on implicit methods of modelling the recovery from card tears. In the Java/JML model we use exception handling as an explicit, clearer method for modelling recovery. This allows us to test the behaviour of the Java implementation in the presence of (simulated) card tears, and to use JML to precisely specify the conditions that should hold after a card tear.


• We contribute a reference model of the Transacted Memory system to SUN's collection, instead of just a reference implementation. The difference is in the presence of the formal JML specification.


In Section 2 we review briefly how Transacted Memory works. Section 3 describes the Java implementation of the system, Section 4 discusses the JML specification for this Java implementation. The last section concludes.


[Figure 1: The process. Diagram not reproduced; recoverable labels: Abstract, Refinement 1, Revise, Refine, Informal.]


2 The Transacted Memory 


Figure 1 describes the relationship between 
the various specifications and implementa- 
tions of the Transacted Memory system. The 
Java/JML reference model, which is the sub- 
ject of this paper, was derived by hand from 
the closely corresponding C implementation 
cum SPIN model for the Java part, and from 
the final refinement of the Z specification for 
the JML part. While Java and C are similar 
in many ways, there are some important dif- 
ferences, discussed in Section 3 below. Here 
we concentrate on how Transacted Memory works, giving excerpts of the abstract Z specification to make the present paper self-contained; the details are in [6, 2].


Transacted Memory is designed around two notions: tags and information sequences. A Tag is merely a unique address, i.e. an identifier of a particular information sequence. An information sequence is a sequence of Info's, where Info is the unit of data stored and retrieved. A sequence of Info's would be used to store a collection of object instances that are logically part of a transaction.


The abstract Z specification (below) makes 
no specific assumptions about either compo- 
nent: 


[Tag, Info] 


The existence of a finite set of available tags is assumed (tags), as well as limits on the size of the memory (msize). There may be several generations of the information associated with a tag, and there is a maximum number of generations that may be associated with any tag (maxgen):


tags : 𝔽 Tag
msize : ℕ₁
maxgen : ℕ₁


The abstract Z specification represents the memory system as two partial functions assoc and size and a set committed, as shown below. We have omitted the constraints on the partial functions and the set:


AMemSys

assoc : tags ⇸ seq(seq Info)
size : tags ⇸ ℕ₁
committed : ℙ tags





The assoc function associates a tag with 
a sequence of sequences of information. The 
first sequence of information represents the 
current information associated with a tag. 
Any further information sequences give older 
generations of this information, in order of 
increasing age. 


The size function gives the length of the 
information sequences associated with a tag. 


The committed set records those tags for 
which the current state of the transacted data 
has been committed. 


Operations are provided to write a new 
generation, and to read the current or older 
generations. All generations associated with 
a tag have the same size, although this could 
be generalised. 


The transaction processing capability of the memory is supported by a commit operation, which makes the most recently written information the current generation. The oldest generation is automatically purged should the number of generations for a tag exceed a preset maximum. It should be noted that the support for recording multiple generations, which can be useful for logging, essentially comes for free, i.e. without any additional implementation cost.


As an example, the abstract Z specifica- 
tion of the operation ACommit is shown be- 
low. The operation commits the current gen- 
eration of information associated with a tag. 
The tag must have an associated information 
sequence, which is flagged as committed. 


ACommit

ΔAMemSys
t? : tags
-----------------------------------
t? ∈ dom assoc
assoc t? ≠ ⟨⟩
committed' = committed ∪ {t?}
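To make the abstract state tangible, here is a small in-memory sketch of assoc and committed together with a commit operation that enforces the ACommit precondition. Everything in it is an assumption made for illustration: the class and method names are hypothetical, tags are plain integers, Info is modelled as String, and the write policy (replace an uncommitted current generation, otherwise add a generation and purge the oldest beyond the maximum) is a simplification of the write operations described in the text, not the authors' implementation.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory model of the abstract state (assoc, committed). */
final class AbstractMemory {
    private final int maxGen;
    private final Map<Integer, Deque<List<String>>> assoc = new HashMap<>();
    private final Set<Integer> committed = new HashSet<>();

    AbstractMemory(int maxGen) {
        this.maxGen = maxGen;
    }

    /** Writes info as the current generation of tag; the tag becomes uncommitted. */
    void write(int tag, List<String> info) {
        Deque<List<String>> gens = assoc.computeIfAbsent(tag, t -> new ArrayDeque<>());
        if (!gens.isEmpty() && !committed.contains(tag)) {
            gens.removeFirst();                  // overwrite the uncommitted generation
        }
        gens.addFirst(new ArrayList<>(info));    // newest generation comes first
        if (gens.size() > maxGen) {
            gens.removeLast();                   // purge the oldest generation
        }
        committed.remove(tag);
    }

    /** Mirrors ACommit: the tag must have a non-empty association; it is then flagged committed. */
    void commit(int tag) {
        Deque<List<String>> gens = assoc.get(tag);
        if (gens == null || gens.isEmpty()) {
            throw new IllegalStateException("tag has no associated information");
        }
        committed.add(tag);
    }

    boolean isCommitted(int tag) {
        return committed.contains(tag);
    }

    int generations(int tag) {
        Deque<List<String>> gens = assoc.get(tag);
        return gens == null ? 0 : gens.size();
    }

    /** Reads a generation by age: 0 is the current generation, 1 the previous one, and so on. */
    List<String> read(int tag, int age) {
        return new ArrayList<>(assoc.get(tag)).get(age);
    }
}
```

With a maximum of two generations, a third committed write silently drops the oldest one, which is the "logging for free" behaviour mentioned above.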





The Transacted Memory must be used in 
such a way that a sequence of operations ei- 
ther completes normally, or that a sequence is 
interrupted at an arbitrary moment by a card 
tear. A recovery operation Tidy is provided 
to return the Transacted memory to a known 
state. The idea is that each time the card is 
inserted in the CAD, the recovery operation 
is automatically started. 


Transacted Memory thus provides undoability (by being able to revert to a previous generation) and persistence (by using EEPROM technology). These are precisely the ingredients necessary to support transactions [10].


To provide this functionality, Transacted Memory maintains a certain amount of bookkeeping information. In its most abstract form, the bookkeeping information records three items:


• The length of the information sequence that is associated with a tag.


• The different generations of information associated with each tag. It is possible that there is no information associated with a tag.


• Which tags are currently committed.


The details of the Z specification may be found in a technical report [2]; here we focus on the API of the Transacted Memory, taken from our previous paper [6] and shown in Figure 2, because this is where the pre- and postconditions of the Java/JML specification provide the major contribution to readability and rigour.


3 A Java implementation of 
Transacted Memory 


The Java implementation was obtained by 
manually transliterating the C code to Java 
code. This is not difficult as the languages 
are close, and for a program of this size (1200 
lines) the effort involved is small. We have 
been careful in transliterating the C code, 
and we are confident that our Java implemen- 
tation closely mimics the C implementation. 
There are two essential differences between 
the Java implementation and the C imple- 
mentation, as explained below. 


Static Type Checking 


The C implementation contains several 
macros to define “types” for the different 
kinds of numeric values (bytes) that are used, 
such as generations, locations, page numbers, 
tags, versions, etc.: 


#define Gen byte /* 0 .. maxgen */ 
#define Loc byte /* 0 .. msize-1 */ 
#define PageNo byte 

#define Tag byte /* 0 .. tsize-1 */ 
#define Ver byte /* 0 .. 2 */ 
#define Inf byte /* 0 .. isize-1 */ 
#define Seq byte /* 0 .. ssize  */ 


These are just macros, and although they in- 
crease the readability of the code, they do not 
provide any type-safety. 


In the Java implementation we have chosen to use different classes for these different kinds of values. This is inefficient since we make what is just a simple byte into an object. The inefficiency is not a primary concern here; we believe it to be more important for a reference model to be as clear and concise as possible². Modelling bytes by classes has the advantage of providing type-safety, as for instance 'generations' and 'tags' are no longer assignment-compatible. Interestingly, this increased type safety immediately revealed a bug in the C code (and SPIN model): in one place a 'version number' was used in a place where a 'page number' was expected. This bug seems to have been a simple typo in the C code. This bug was not discovered in the model checking using SPIN, nor in testing of the C implementation, because the test harness for the Transacted Memory used there was fairly restricted.


The discovery of this bug illustrates the 
value of a statically enforced type system. 
Especially for code like that of the Trans- 
acted Memory, which is littered with differ- 
ent ‘kinds’ of bytes, it is easy to confuse a 
byte representing a page number with a byte 
representing a ‘version’. It is a pity that C 
and Java do not have type-safe enumeration 
types, and that JML does not improve the 
level of expressiveness of the Java/JML com- 
bination in this respect. 
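The kind of wrapper class described above might look as follows. This is a minimal sketch, not the authors' code: the class names follow the C macros, the field name value follows the usage tag.value in the JML specification, and the page count, page size and the offsetOf helper are hypothetical.

```java
/** Version number, 0..2 (mirrors the C macro Ver). */
final class Ver {
    final byte value;

    Ver(int value) {
        if (value < 0 || value > 2) {
            throw new IllegalArgumentException("version out of range");
        }
        this.value = (byte) value;
    }
}

/** Page number; the bound is a parameter because the page count is not given in the text. */
final class PageNo {
    final byte value;

    PageNo(int value, int pageCount) {
        if (value < 0 || value >= pageCount) {
            throw new IllegalArgumentException("page number out of range");
        }
        this.value = (byte) value;
    }
}

final class Pages {
    /** Byte offset of a page for a hypothetical page size. */
    static int offsetOf(PageNo page, int pageSize) {
        return page.value * pageSize;
    }

    public static void main(String[] args) {
        PageNo page = new PageNo(3, 8);
        Ver version = new Ver(1);
        System.out.println(offsetOf(page, 16));
        // offsetOf(version, 16);   // rejected by the compiler: a Ver is not a PageNo
        System.out.println(version.value);
    }
}
```

With plain bytes, the commented-out call would compile and run, which is exactly the class of typo the type checker caught.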


² Also, the Java Card technology offers the possibility to optimize API components, such as the transacted memory API, in the offcard converter.


typedef struct { Gen old, new ; byte cnt ; } GenGenbyte ;
structure used to hold the number of the oldest and newest generation,
and the number of generations.


typedef struct { Size size ; Info data[ssize] ; } InfoSeq ;
structure used to hold an information sequence and its size.


GenGenbyte DGeneration( Tag ) ; 
Return all available information for the given tag. The result is undefined 
if the tag is not in use. 


Tag DNewTag( Size ) ; 
Return an unused tag of the specified size. The result is undefined if no 
tag is available. 

void DTidy( ) ; 
Recover from an interrupted write operation. 


InfoSeq DReadGeneration( Tag, Gen ) ; 
Read the information sequence of a given tag and generation. The in- 
formation sequence is undefined if the tag is not in use. 


InfoSeq DRead( Tag ) ; 
Read the information sequence of the current generation associated with 
the given tag. 


void DCommit( Tag ) ; 
Commit the current generation for the given tag. The operation has no 
effect if the tag is already committed. 


void DRelease( Tag ) ; 
Release all information associated with the given tag. The operation has 
no effect if the tag is not in use. 


void DWriteFirst( Tag, InfoSeq ) ; 
Write to a tag immediately after the DNewTag operation. The result is 
undefined if insufficient space is available. 


void DWriteUncommitted( Tag, InfoSeq ) ; 
Write to a tag whose current generation is uncommitted. 


void DWriteCommittedAddGen( Tag, InfoSeq ) ; 
Write to a tag whose current generation has been committed, and whose 
maximum number of generations has not been reached. 


void DWriteCommittedMaxGen( Tag, InfoSeq ) ; 
Write to a tag whose current generation has been committed, and whose 
maximum number of generations has been written. The oldest genera- 
tion will be dropped. 





Table 1: Transacted Memory data structures and functions for C.


Modelling card tear


The second, and more important, aspect in 
which the Java implementation essentially 
differs from the C implementation is that 
we use Java’s exception mechanism to model 
card tears. We introduce a special exception 
class CardTearException, and a card tear is 
simulated by throwing this exception. This is 
useful, because it allows us 


1. to test the behaviour of the program when card tears occur; in the Java method that models atomic writes to EEPROM we can easily simulate random card tears by randomly choosing to throw a CardTearException or not, before or after the atomic write to EEPROM.


2. to specify in JML the properties that should hold after a card tear occurs; this will be discussed in Section 4.
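The first point can be sketched as follows. Only the exception name CardTearException comes from the text; the Eeprom class, its fields and the tear probability parameter are hypothetical, and a seeded java.util.Random is used so that a test run can be repeated.

```java
import java.util.Random;

/** Simulated card tear (name taken from the text; the body is an assumption). */
class CardTearException extends Exception {
    private static final long serialVersionUID = 1L;
}

/** Hypothetical model of EEPROM whose single-cell writes are atomic but may be torn around. */
final class Eeprom {
    private final byte[] cells;
    private final Random random;
    private final double tearProbability;

    Eeprom(int size, Random random, double tearProbability) {
        this.cells = new byte[size];
        this.random = random;
        this.tearProbability = tearProbability;
    }

    /** Atomic write: a tear may happen before or after it, never in the middle. */
    void atomicWrite(int address, byte value) throws CardTearException {
        if (random.nextDouble() < tearProbability) {
            throw new CardTearException();       // tear before the write: cell unchanged
        }
        cells[address] = value;
        if (random.nextDouble() < tearProbability) {
            throw new CardTearException();       // tear after the write: cell updated
        }
    }

    byte read(int address) {
        return cells[address];
    }
}
```

A test harness would call the operation under test in a loop, catch CardTearException, run the recovery operation and then check the expected properties.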


In fact, though we will not pursue this point 
in this paper, a card tear can be modelled 
very accurately as an (uncatchable) Java ex- 
ception, for which the power-on mechanism 
of the card provides the exception handler; 
see [7]. 


In a later stage we will also introduce Java 
exceptions to signal that there is insufficient 
free transacted memory to carry out an oper- 
ation, as discussed at the end of Section 4. 


4 JML specifications for the 
Java implementation 


The Java Modeling Language (JML) [9] is 
a behavioural interface specification language 
tailored to Java. JML is developed primarily 
by Gary T. Leavens at Iowa State University. 
Java programs can be specified using JML 
by annotating them with invariants, pre- and 
postconditions, and other kinds of assertions. 
JML combines features of Eiffel (or ‘Design 
by Contract’) and model-based approaches, 
such as Larch/LSL and VDM. 


JML annotations are written as a special kind of Java comments. This means they are ignored by normal Java compilers, but can be used by special tools for JML. The tools we have used on our JML-annotated code are the JML type-checker and the JML runtime assertion compiler [4]. Both these tools can be downloaded from www.jmlspecs.org. The runtime assertion compiler turns annotations into runtime checks, so that any violation of an annotation at runtime produces an error.


To create the JML specifications for the 
Java implementation, elements of the Z speci- 
fications and of the informal comments given 
in the C code were converted into pre- and 
postconditions, class invariants, and loop in- 
variants. The JML specifications we have 
written are partial in the sense that they do 
not give a complete specification of Trans- 
acted Memory. Still, the specifications do ex- 
press the main properties that should hold for 
the Transacted Memory, and have proven to 
be sufficiently detailed to find bugs, as we will 
discuss later. 


Figure 2 gives an example of a JML spec- 
ification, namely the specification of the 
method DWriteUncommitted. The JML spec- 
ification is written between the annotation 
markers /*@ and @*/. 


/* Write to a tag whose current generation is uncommitted. */

/*@ requires ddata[tag.value].tagInUse;              // tag in use
    requires ddata[tag.value].size == i.seq;         // i of right length
    requires ! ddata[tag.value].committed;           // tag uncommitted

    ensures DRead(tag).equals(i);                    // i written successfully
    ensures ! ddata[tag.value].committed;            // tag still uncommitted

    signals (CardTearException) ! ddata[tag.value].committed;
    signals (CardTearException) DRead(tag).equals(i)
                             || DRead(tag).equals(\old(DRead(tag)));
  @*/

public void DWriteUncommitted(Tag tag, InfoSeq i)
                       throws CardTearException;


Figure 2: JML specification of DWriteUncommitted


The first three lines of the JML specification, starting with requires, give the precondition of the method. Here the precondition is that the tag should be in use, the information sequence i should be of the right length, and the tag should not be committed. When doing runtime assertion checking, any invocation of DWriteUncommitted which violates these preconditions will produce an error message³.


The next two lines, starting with ensures, give the postcondition of the method. The first of these lines says that if we read back the value for tag using DRead we get the value i we just assigned to it, the second says that the tag is still not committed. When doing runtime assertion checking, any invocation of DWriteUncommitted which does not establish these postconditions will produce an error message.


³ Actually, JML is so expressive that some JML assertions are not decidable, e.g. assertions using the keyword forall to quantify over an infinite domain; these (parts of) JML assertions are not compiled into runtime checks.


Finally, the last lines of the JML specification, starting with signals, give the exceptional postcondition. Whereas ensures clauses specify the 'normal' postconditions, i.e. properties that should hold after normal termination of a method invocation, signals clauses specify properties that should hold at the end of a method invocation if an exception is thrown. The first signals clause here says that if a CardTearException is thrown then the tag remains uncommitted. The second signals clause says that if a CardTearException is thrown, then either


DRead(tag).equals(i)
or
DRead(tag).equals(\old(DRead(tag)))


i.e. reading back the value for tag either produces the 'new' value i just written or it produces the 'old' value of DRead(tag). The JML keyword \old is used here to refer to the value an expression had before execution of the method.


Note that the information sequence i may consist of several bytes, and that a single DWriteUncommitted operation may require several writes to EEPROM. EEPROM is typically written block by block, where the block size depends on the particular EEPROM. So the second signals clause states the atomicity of the DWriteUncommitted operation!


When doing runtime assertion checking, any invocation of DWriteUncommitted which throws a CardTearException and which does not establish the exceptional postconditions will produce an error message. Throwing an exception that is not a CardTearException will also produce an error message, as there are no signals clauses allowing other exceptions to be thrown.


Everything the runtime assertion checker does could be programmed by hand, as tests in the code (the C implementation has a number of these tests scattered through the code), but note that for something like the second signals clause above this is far from trivial! It would involve catching and rethrowing exceptions at the end of the method, as well as somehow recording the 'old' value that DRead(tag) has in the pre-state. The JML runtime assertion tool compiles all this into the code automatically, which is useful, as it means we can concentrate on the essentials.
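For comparison, a hand-written version of the check for the second signals clause could look roughly like this. It is a sketch under assumptions, not output of the JML tools: TagStore, DemoStore and CheckedWrite are hypothetical stand-ins for a single tag of the Transacted Memory, and the demo store deliberately tears a write half-way to show the check firing.

```java
import java.util.Arrays;

class CardTearException extends Exception {
    private static final long serialVersionUID = 1L;
}

/** Hypothetical single-tag store standing in for the Transacted Memory. */
interface TagStore {
    byte[] read();
    void write(byte[] data) throws CardTearException;
}

/** Demo store whose write can be torn half-way, leaving a mix of old and new bytes. */
final class DemoStore implements TagStore {
    private byte[] data;
    private final boolean tearMidway;

    DemoStore(byte[] initial, boolean tearMidway) {
        this.data = initial.clone();
        this.tearMidway = tearMidway;
    }

    public byte[] read() {
        return data.clone();
    }

    public void write(byte[] newData) throws CardTearException {
        if (tearMidway) {
            data[0] = newData[0];                // only the first byte reaches memory
            throw new CardTearException();
        }
        data = newData.clone();
    }
}

final class CheckedWrite {
    /** Performs the write and checks the normal and exceptional postconditions by hand. */
    static void write(TagStore store, byte[] newData) throws CardTearException {
        byte[] old = store.read();               // plays the role of \old(DRead(tag))
        try {
            store.write(newData);
        } catch (CardTearException tear) {
            byte[] now = store.read();
            if (!Arrays.equals(now, newData) && !Arrays.equals(now, old)) {
                throw new AssertionError("exceptional postcondition violated: write was not atomic");
            }
            throw tear;                          // rethrow once the check has passed
        }
        if (!Arrays.equals(store.read(), newData)) {
            throw new AssertionError("postcondition violated: data not written");
        }
    }
}
```

Even for one clause, the wrapper needs a pre-state copy, a catch block and a rethrow, which is the bookkeeping the runtime assertion compiler generates automatically.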


The other three write-operations - 
DWriteFirst, DWriteCommittedAddGen, and 
DWriteCommittedMaxGen - have specifica- 
tions very similar to the one discussed above. 
The only difference is in their preconditions. 


The specification of DWriteUncommitted
above is still incomplete. For example, it does
not specify that the older generations of the 
tag are left unchanged. Still, specifications 
like this turn out to be detailed enough to 
give useful feedback when checking them at 
runtime. As discussed below, several prob- 
lems with the implementation came to light 
when performing runtime assertion checks. 


Bug 1 — Uncommitting new generations 


When we tested the Transacted Memory, the runtime assertion checker immediately reported that DWriteCommittedAddGen and DWriteCommittedMaxGen do not establish their postconditions; more specifically, they fail to establish


ensures !ddata[tag.value] .committed; 


The implementations of these methods forget 
to reset the committed flag of the tag. This 
bug was not discovered using SPIN, because 
the test harness used there committed every 
new generation immediately after the write 
operation. 


Note that even in the Java/JML model we 
could have forgotten this postcondition, and 
then we would not have discovered the prob- 
lem either. However, by systematically writ- 
ing specifications for all the operations we be- 
lieve one is less likely to forget something like 
this. 


Bug 2 - Inadvertent commit 


After Bug 1 was repaired, a second bug was 
discovered by runtime assertion checking. We 
also repaired the SPIN model and re-ran the 
model checker on that, and found the same 
error there. 


The operations DWriteCommittedAddGen 
and DWriteCommittedMaxGen start a new, 
uncommitted, generation, but a card tear at 
a certain point in their execution may inad- 
vertently commit the new generation written. 
Both DWriteCommittedAddGen and DWrite- 
CommittedMaxGen first write the data for the 
new generation. This may take several atomic 


writes, but the last of these implicitly records 
that the whole write has been successful (in 
effect, making the whole writing of the data 
atomic). Then the commit flag is cleared - 
also atomically, but separate from the last 
write for the data. If a card tear occurs im- 
mediately after the data is written, but before 
the commit flag is cleared, the tag will appear 
committed to the recovery process, whereas 
in reality it should be uncommitted. The 
recovery process was not designed to detect 
this, and indeed a warning to this effect ap- 
pears in the original Z specification [2, page 
34]. 


The solution which we have implemented is
to use not a boolean commit flag but a
three-valued flag, so that a
DWriteCommittedAddGen or DWriteCommittedMaxGen
interrupted at the precise point above can be
detected during recovery. (An alternative
solution would be to store the last of the data
and the commit flag together in the same
EEPROM block, as opposed to storing them
in separate areas, so that writing the last of
the data and clearing the commit flag become
one atomic operation.)
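
To make the idea of a three-valued flag concrete, the following is a minimal Java sketch. It is not the implementation described in this paper: the class name, the enum values, the extra WRITING state and the recovery policy are all assumptions chosen for illustration, and the several atomic data writes are collapsed into one assignment.

```java
/** Hypothetical sketch: a three-valued commit flag instead of a boolean one. */
final class CommitFlagSketch {
    /** WRITING is the assumed extra value that makes an interrupted write detectable. */
    enum CommitState { WRITING, UNCOMMITTED, COMMITTED }

    /** Models a card tear (power loss) at a chosen point in the write. */
    static final class CardTearException extends RuntimeException {}

    CommitState state = CommitState.COMMITTED; // flag as left by the previous generation
    String data = "old";

    /** Starts a new, uncommitted generation; tearAfterData simulates a tear between the writes. */
    void writeNewGeneration(String newData, boolean tearAfterData) {
        state = CommitState.WRITING;      // atomic write: announce that a write is in progress
        data = newData;                   // atomic write: last write of the data
        if (tearAfterData) {
            throw new CardTearException();
        }
        state = CommitState.UNCOMMITTED;  // atomic write: clear the commit flag
    }

    /** Assumed recovery policy: an interrupted write is never treated as committed. */
    boolean appearsCommittedAfterRecovery() {
        if (state == CommitState.WRITING) {
            state = CommitState.UNCOMMITTED;
        }
        return state == CommitState.COMMITTED;
    }
}
```

With a boolean flag the tear in this sketch would leave the old value COMMITTED in place; the third value is what lets recovery tell the two situations apart.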


Optimisations and Improvements in 
the Algorithm 


In addition to finding the bugs above, the 
systematic analysis of the code required to 
write the JML specifications also had the ben- 
efit of suggesting several optimisations and 
improvements to the code. 


Efficiency Improvements 


The method DGeneration(Tag tag) discov- 
ers the generation indices associated with a 
tag, and then returns the indices of the old- 
est and newest generation, as well as the num- 
ber of generations. To better understand the 
implementation of this method, it was anno- 
tated with JML assert clauses. An assert 
clause can occur anywhere in a method body, 
and specifies a property that should hold at 
this point in the program. When doing run- 
time assertion checking, any violation of an 
assert clause will produce an error message. 
Annotating the implementation of
DGeneration(Tag tag) with assert
clauses, we discovered that one for-loop
could be removed, as the value it computed
could be obtained directly from values
already known.


Also, a redundant modulo operation % (i.e. 
one where the first argument will always be 
smaller than the modulus) was discovered in 
the implementation of DGeneration. 
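
The effect of such assert clauses can be illustrated with a small sketch. The method below is hypothetical and is not taken from DGeneration; it only shows how a JML assert clause, written as a special comment, can expose a modulo operation whose first argument is always smaller than the modulus.

```java
final class AssertSketch {
    /** Hypothetical helper: index of the slot following i in a ring of n slots. */
    static int next(int i, int n) {
        //@ assert 0 <= i && i < n;
        int j = i % n;        // redundant: the assertion above guarantees i % n == i
        //@ assert j == i;
        return (j + 1) % n;   // this modulo is needed, since j + 1 may equal n
    }
}
```

A plain Java compiler treats the //@ lines as comments; a JML runtime assertion checker would turn them into checks and report any violation.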


Interface Improvements 


The four operations for writing to the Trans- 
acted Memory are: 


• DWriteFirst
• DWriteUncommitted
• DWriteCommittedAddGen
• DWriteCommittedMaxGen


These operations have identical postcondi- 
tions, and only differ in their preconditions. 
This raises the question whether it is not bet- 
ter to have a single method DWrite, which 
chooses the ‘right’ write operation and exe- 
cutes it. Indeed the original Z specification 
offers such a ‘comprehensive’ write operation, 
defined by way of a schema conjunction of the 
write operations listed above. However, this 
operation was forgotten in the development 
of the C cum SPIN code. 


An unsatisfactory feature of the Transacted 
Memory as originally implemented in C is 
that if there is insufficient space to perform 
a write operation, it may be carried out only 
partially, resulting in an inconsistent state, 
without any warning. The informal specifica- 
tion of DWriteFirst in Table 2 does indeed 
say that its effect is undefined if insufficient 
space is available. The same can happen in 
the other write operations, although their in- 
formal specifications do not say this. 


Our initial JML specifications for the write 
methods, e.g. the one in Figure 2, did not al- 
low for this, and the runtime assertion checker 
warned about violations of them. 


We improved the Java implementation so
that an OutOfTransactedMemoryException is
thrown in case insufficient space is available
to perform a write operation. The JML
specifications were adapted accordingly. For
example, in the specification for
DWriteUncommitted in Figure 2 we added


signals (OutOfTransactedMemoryException)
    DRead(tag).equals(is) &&
    !ddata[tag.value].committed;


stating that the write operation won't happen 
at all in case an OutOfTransactedMemory- 
Exception is thrown. 
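
One way to obtain this all-or-nothing behaviour is to check the available space before changing any state. The sketch below is an illustration only: the class WriteSketch, its byte-array store and its method names are assumptions, and only the exception name is taken from the text.

```java
final class WriteSketch {
    static final class OutOfTransactedMemoryException extends Exception {}

    private final byte[] store;  // hypothetical fixed-size memory
    private int used = 0;        // number of bytes currently in use

    WriteSketch(int capacity) { store = new byte[capacity]; }

    int used() { return used; }

    /** Checks the space first, so that a failed write leaves the state untouched. */
    void write(byte[] info) throws OutOfTransactedMemoryException {
        if (info.length > store.length - used) {
            throw new OutOfTransactedMemoryException();
        }
        System.arraycopy(info, 0, store, used, info.length);
        used += info.length;
    }
}
```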


Similarly, the operation DNewTag was 
adapted to throw an OutOfTagsException 
when no additional tag is available, rather 
than producing an undefined result in this 
case. 


4.1 Future Work with these JML 
specs 


We also translated the abstract Z specifi- 
cation given in [6] to Java/JML. This was 
not difficult, given that JML comes with 
a package org.jmlspecs.models that pro- 
vides Java implementations of all the stan- 
dard mathematical concepts used in the Z 
specification. For example, Figure 3 gives the 
JML translation of the Z specification of the 
operation ACommit shown in Section 2. 


One obvious difference is that the Z
specification looks prettier, as in Java/JML we
do not have conventional mathematical
notation, such as ∈ or #.


A more important difference is that the 
JML/Java specification can be turned into an 
executable one, namely 


public void ACommit(Tag t)
{ committed = committed.insert(t);
}


We could use this Java implementation of the
abstract specification to give a more detailed
specification for our current Java
implementation. Basically, the idea would be to
define a Java implementation which executes the
current Java implementation and this more
abstract one side by side, and to express the
relation between the two in JML assertions.
However, as the abstract specification does
not consider the possibility of card tears, the
relation between this abstract implementation
and the current Java implementation is not
trivial to make precise. This is left
as future work.


//@ import org.jmlspecs.models.*;

/*@ requires assocs.domain().has(t) &&
  @          !assocs.apply(t).isEmpty();
  @ ensures  committed.equals(\old(committed).insert(t));
  @*/
public void ACommit(Tag t)


Figure 3: JML specification of ACommit
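
The executable version of ACommit given above can be tried out without the JML model classes. The sketch below is an assumption-laden stand-in: it replaces the model set from org.jmlspecs.models by a copying insert on java.util.Set, models tags as strings, ignores the precondition on assocs, and checks the ensures clause of Figure 3 by hand.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

final class AbstractMemorySketch {
    /** Functional insert: returns a new set and leaves the argument unchanged. */
    static <T> Set<T> insert(Set<T> s, T t) {
        Set<T> r = new HashSet<>(s);
        r.add(t);
        return Collections.unmodifiableSet(r);
    }

    Set<String> committed = Collections.emptySet();  // tags modelled as strings here

    void aCommit(String t) {
        Set<String> old = committed;                 // plays the role of \old(committed)
        committed = insert(committed, t);
        // hand-written runtime check of the ensures clause
        if (!committed.equals(insert(old, t))) {
            throw new AssertionError("postcondition of ACommit violated");
        }
    }
}
```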


5 Conclusions 


The work described in this paper, i.e. 


• developing a Java implementation based
on a C implementation, and


• developing JML specifications based on
a Z specification, and


• checking the Java implementation
against the JML specification using
runtime assertion checking,


has been successful in finding bugs and im- 
proving the implementation. The bugs we 
found range from simple typos to more se- 
rious errors, and to some misunderstandings 
between different people that have been in- 
volved in the design of the Transacted Mem- 
ory. 


It is disappointing that the careful
development of the system as reported in our
previous paper [6], starting from a formal
abstract Z specification that was refined to a
C/SPIN implementation, which was model-checked,
did leave these bugs in the final
implementation.


In all fairness, we must admit that the orig- 
inal testing scenario for the C/SPIN imple- 
mentation with the model-checker SPIN was 
too restricted. Conventional testing of the C 
implementation would have discovered many 
of the bugs that we found, but probably with 
more effort. Runtime assertion checking of 
JML specifications makes it easier to locate 
bugs than conventional testing. Indeed, no 
complicated testing scenarios were needed to 
find any of the bugs discussed. 


Some problems and possible improvements 
were found before we even tried runtime as- 
sertion checking, but were spotted when try- 
ing to come up with good specifications in 
the first place. Annotating Java code with 
JML specifications provides a systematic way 
of performing a thorough code review, which 
can help to discover bugs and may point to 
possible optimisations or improvements. By 
contrast, testing of the code may find the 
bugs, but will probably not suggest optimi- 
sations or improvements. 


There is a fairly standard recipe for anno- 
tating Java code with JML. Typically, one 
starts by giving pre- and postconditions for 
each method; these can be based on existing 
informal specifications, on our informal un- 
derstanding of the program, and - somewhat 
exceptionally here — on the formal Z specifica- 
tions. For each method implementation one 
then informally checks that any method in- 
vocations it contains do not violate their pre- 
conditions; this may require further strength- 
ening of its precondition, or the introduction 
of loop invariants. Then one compares the 
different pre- and postconditions that have 
been written. Commonalities between pre- 
and postconditions may suggest class
invariants. Differences between them may point
out possible omissions; e.g. if the
precondition of DWriteUncommitted(Tag tag)
requires a tag to be uncommitted, then its
postcondition should probably state whether this
tag remains uncommitted or not, and
possibly other methods that have a tag as
argument should be specified with similar
conditions. Finally, any violations of assertions
found during runtime assertion checking in 
test scenarios may of course lead to improve- 
ments in the JML specifications. 


For the system we considered, a vital ad- 
vantage of using Java over using C is that 
we can conveniently model card tears using 
Java’s exception mechanism. A disadvantage 
of using Java instead of C is that C is prob- 
ably closer to a realistic implementation in 
actual hardware. 


Using Java and JML, rather than C and Z, 
for implementation and specification, has had 
several advantages. 


Firstly, it becomes possible to check the 
relation between implementation and speci- 
fication: runtime assertion checking tells us 
where Java implementation and JML specifi- 
cation disagree. This may of course just as 
well be a mistake in the Java implementation 
as a mistake in the JML specification. 


Secondly, Java implementation and JML 
specification are close together, in the same 
file. The usefulness of this is illustrated by 
the fact that the Z specification actually dis- 
cusses the possibility of bug 2, but in a foot- 
note on page 34 of [2], something one is not 
likely to notice or remember when working on 
the C implementation. 


Finally, the JML specifications are a lot 
easier to understand than the Z specifica- 
tions, except for experts in Z. JML mainly 
uses Java notions and notations, and it has 
been the overriding design principle in the 
design of JML that specifications should be 
easy to understand by any Java programmer. 
Indeed, a point we would like to stress is that 
formal methods need not involve notations 
and tools that only specialists can use. Our 
formal model is a Java program, that can be 


understood by anyone familiar with Java, as 
can the formal specifications for it written in 
JML. In this respect, it is interesting to note 
the contrast with Z and SPIN - or indeed 
UML! Developing the kind of JML specifica- 
tions we discussed in this paper and using the 
runtime assertion checker should not pose any 
problem for competent Java programmers. 
Abstract


The paper describes a framework for model checking 
JavaCard applets on the bytecode level. From a set 
of JavaCard applets we extract their method call 
graphs using a static analysis tool. The resulting 
structure is translated into a pushdown system for 
which the model checking problem for Linear Tem- 
poral Logic (LTL) is decidable, and for which there 
are efficient model checking tools available. The 
model checking approach of the paper is tailored to 
the analysis of inter applet (intra card) communi- 
cations and we demonstrate it using a prototypical 
example of a purse applet and a set of loyalty ap- 
plets. 


1 Introduction 


Smart cards have come to play an ever increasing 
role in our lives. We use them in electronic bank- 
ing, to keep health care data, for mobile telephony, 
and in many other applications. The most impor- 
tant aspect of smartcards is their security; users and 
card issuers have to agree that the level of security 
provided by a smartcard platform is enough to pre- 
vent malicious agents from abusing their trust in a 
card application. 


Since the number of smartcard applications is grow- 
ing rapidly, it is natural to provide smartcards with 
the possibility of accommodating multiple applica- 
tions, and the possibility to delete or add new appli- 
cations after the card has been issued. Furthermore, 


*The research has been conducted within the VerifiCard 
project with financial support from the IST programme of 
the European Union. 


such multi-application smartcards allow partner ap- 
plications to cooperate and exchange data. Popular 
applications of multi-application cards are partner 
loyalty programs, mobile telephone to banking part- 
nership programs, etc. The JavaCard platform [12] 
is one platform for building such multi-application 
smartcards. It is based on a subset of Java tailored 
to the task of embedding on a smartcard. The cur- 
rent standard omits many of the features of Java 
such as concurrency through threads, garbage col- 
lection, and many API functions but has a notion 
of applets to support multiple applications. 


One important aspect which distinguishes multi- 
applet JavaCards from single-applet ones is the sup- 
port for inter-applet communication via method 
calls. Communication naturally comes at a price: 
applets must guard against illicit invocations of 
their public methods from unwarranted applets, and 
from leakage of data to third parties. Even if a 
multi-applet application were to be proved safe, 
there still exists the possibility of new unsafe
applets being loaded onto the card post-verification.
The JavaCard platform provides features to par- 
tially address these security concerns. Apart from 
a Java-style byte code verifier, which in the cur- 
rent generation of JavaCard smartcards is typically 
located off-card, there is a concept of a communica- 
tion firewall that by default prohibits applets from 
communicating with each other. To enable commu- 
nication to flow between applets, a recipient applet 
has to explicitly permit calls from the caller applet. 


Such checks are static in nature: a method call is
either always allowed or never allowed. The work
reported here, in contrast, makes it possible to begin
characterising temporal restrictions on inter-applet
communication. In formulating such restrictions we
consider a situation in which a set of applets has
been loaded onto a smartcard, and formulate
properties in Linear Temporal Logic (LTL) regarding
inter-applet communications (in addition to
properties about intra-applet method calls and API
usage).


To provide a semantic bridge between multi-applet 
programs and the temporal logic specification lan- 
guage, we use the abstract notion of a program 
graph, capturing the control flow of programs with 
procedures/methods, and which can be efficiently 
computed. The behaviour of such program graphs 
is defined through the notion of pushdown systems, 
which provide a natural execution model for pro- 
grams with methods (and possibly recursion), and 
for which completely automatic model checkers for 
LTL exist. 


In more detail the model checking proceeds as
follows. First the method call graphs of a set of
JavaCard applets are obtained using a Java byte code
analysis tool [13] developed at INRIA Rennes, which
we have adapted for JavaCard. The analysis is per- 
formed on a class basis. As a consequence individ- 
ual applet instances cannot be reasoned about; cor- 
rectness properties concern activation of methods of 
classes extending the JavaCard Applet class, rather 
than activation of methods of an applet instance. 
Further details and limitations of this static analy- 
sis procedure are discussed in Section 2. 


The resulting method call graphs are translated into 
pushdown systems, a natural execution model for 
programs with recursion. Essentially a pushdown 
system is a pair of a control location with a stack 
of stack symbols. In our encoding we use a single 
control location and let the stack symbols represent 
the program points of the underlying JavaCard ap- 
plets. The details of the translation are elaborated 
in Section 3.1. 


For pushdown systems the model checking proce- 
dure for Linear Temporal Logic (LTL) is decidable 
and of polynomial complexity in the size of the sys- 
tem [3, 9, 7]. The atomic predicates of the logic, 
tailored to JavaCard, are the program points them- 
selves and predicates expressing class and package 
membership of program points. The Moped model 
checker [8] is used to check LTL properties of
pushdown systems. Sections 3.2, 3.3 and 3.4 describe
the logic and our use of the Moped tool in further
detail.


To motivate and demonstrate our approach we have 
selected a prototypical JavaCard example: a purse 







CARDIS '02: 5th smart Card Research & Advanced Application Conference 


applet stores money, and interacts with loyalty ap- 
plets on receiving a purchase order. A loyalty ap- 
plet can have agreements with other applets, and 
can thus in turn communicate with another applet 
on receiving information about a purse transaction. 
In Section 4 we demonstrate the effectiveness of our 
approach in analysing such inter-applet communi- 
cation patterns. 


There is by now a growing body of related work on
model checking Java (or JavaCard), and more generally
on the formal analysis of JavaCard applications;
below we mention a few examples.


The Compaq Extended Static Checker for Java 
(ESC/Java) [14], developed at the Compaq Sys- 
tems Research Center (SRC), is a programming tool 
for finding errors in Java programs. ESC/Java in- 
cludes an annotation language with which program- 
mers can express design decisions using light-weight 
specifications. Checking is neither sound nor
complete, but can yield informative warning messages¹.
A case study in the context of JavaCard, based on 
the Gemplus purse applet, is presented in [5]. 


The first version of the Java PathFinder [10], JPF, 
was a translator from a subset of Java 1.0 to 
PROMELA, the programming language of the Spin 
model checker. A similar translator tool from Java 
to PROMELA (actually the variant of PROMELA 
for the dSpin tool) is reported in [11]. The Java 
Pathfinder tool is especially suited for analyzing 
multi-threaded Java applications, where normal 
testing usually falls short. The tool can find dead- 
locks and violations of boolean assertions stated by 
the programmer in a special assertion language. A 
second version of the tool reportedly works directly 
on bytecode and has support for garbage collection².


The Bandera Project [6] aims to develop tech- 
niques and tools for automated reasoning about 
Java based software system behavior, and to ap- 
ply these tools to construct high-confidence mission- 
critical software. Automated reasoning is achieved 
by (1) mechanically creating high-level models of 
software systems using abstract interpretation and 
partial evaluation technologies, and then (2) em- 
ploying model-checking techniques to automatically 
verify that software specifications are satisfied by 
the model³.


¹http://research.compaq.com/SRC/esc/
²http://ase.arc.nasa.gov/jpf/
³http://www.cis.ksu.edu/santos/bandera/
In [2] an approach is presented for checking
properties of multi-applet interactions of JavaCards,
based on associating security levels with applets and
applet data so as to detect illegal flows of
information between applets. Technically the approach
requires building abstract models by hand from byte
code, and then checking them automatically using the
SMV model checker.


Our work is related to the program verification ap- 
proach of [13] which is based on method call graphs. 
The operational semantics of the graphs, however, 
is given there directly through a set of transition 
rules (rather than through pushdown systems), and 
security properties are expressed as call-stack invari- 
ants. Following a similar program representation, a 
compositional account is given in [1], where a com- 
positional proof system for inferring temporal prop- 
erties of a multi-applet program from the properties 
of the individual applets is presented. 


2 Constructing Method Call Graphs 


We use an external static analysis tool, developed 
for a Java verification framework [13], to generate 
call graphs which abstract from everything (such 
as data variables, and parameters to method calls) 
but the presence and order of method calls inside 
method bodies. The analysis tool performs a safe
over-approximation (with regard to preservation of
LTL safety properties) in the sense that call edges
may be present in the resulting call graph even if
the corresponding calls can never occur at runtime,
whereas every call that can occur at runtime is
represented by an edge. For instance, when the static
analysis cannot determine which class method is
invoked in a method call, typically due to
subtyping, then a call edge is generated to a target
method in every possible class, thus increasing the
nondeterminism in the generated call graph. The
static analysis tool generates graphs with informa- 
tion about exceptional behaviours. In this work ex- 
ceptional edges, and nodes, are translated into non- 
deterministic constructs thus effectively increasing 
the non-determinism in program behaviour in a con- 
servative fashion. 


The call graph generation is also conservative with 
respect to the JavaCard firewall mechanism, which 
is not considered during static analysis. That is, 
a method call that at runtime will fail the security
checks of the JavaCard runtime environment will
nevertheless invariably be included in the method
call graphs.


Analysis starts from a set of JavaCard classes, which 
should include the implementation of all on-card ap- 
plets. To refine the analysis, and to permit analysis 
of JavaCard API usage, the API classes of SUN’s 
Java Card Development Kit (version 2.1.2) are in- 
cluded in the method call generation. The result of 
analysis is a set of method call graphs. 


2.0.1 Method Call Graphs 


The methods M are partitioned into classes C, 
which are themselves partitioned into packages 
P. We assume the usual Java naming conventions
with fully qualified names, i.e., a class has a
name Package.identifier and a method has a name
Class.identifier.


Definition 1 (Method Graph, adapted
from [13]). A method graph is a tuple


$m \triangleq (V_m, \rightarrow_m, \lambda_m, L_m)$


such that:


(i) $V_m$ are the program points of $m$,

(ii) $\rightarrow_m \subseteq V_m \times V_m$ are the transfer edges of $m$,

(iii) $\lambda_m : V_m \rightarrow T$ designates to each program point
of $m$ a program point type from the set
$T \triangleq \{\text{entry}, \text{seq}, \text{call}, \text{return}\}$, and

(iv) $L_m : V_m \rightarrow \wp(M)$ designates to each program
point of type call of $m$ a non-empty set of methods.


We assume the program point sets $V_m$ to be pairwise
disjoint. The program points of the program are the
set $V \triangleq \bigcup_{m \in M} V_m$.


The program point type indicates whether (entry) 
a node is the entry point of a method, (seq) a 
node in which no method call or return takes place, 
(call) a node from which a method call takes place, 
or (return) a node in which the execution of the 
method finishes and control flow returns to the call- 
ing method. 
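For convenience, we introduce the predicates


$v : t \;\triangleq\; \lambda_m(v) = t$ for $t \in T$
$v : \text{loc}\ m \;\triangleq\; v \in V_m$
$v : \text{entry}\ m \;\triangleq\; v : \text{entry} \wedge v : \text{loc}\ m$
$v : \text{return}\ m \;\triangleq\; v : \text{return} \wedge v : \text{loc}\ m$
$v : \text{class}\ c \;\triangleq\; \exists m.\ v : \text{loc}\ m \wedge m \in c$
$v : \text{package}\ p \;\triangleq\; \exists c.\ v : \text{class}\ c \wedge c \in p$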


We further define a predicate $v : \text{api}$, which holds if
the program point $v$ occurs in a method in a JavaCard
API package (for standard JavaCard this corresponds
to one of java.lang, javacard.framework,
javacard.security or javacardx.crypto).
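
As an illustration of how a method graph and the membership predicates might be represented, here is a minimal Java sketch. All class, field and method names are hypothetical, program points are modelled as strings, and methods are assumed to carry three-part names of the form Package.Class.method.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

final class MethodGraphSketch {
    enum PointType { ENTRY, SEQ, CALL, RETURN }

    /** One method graph; the key set of 'type' plays the role of the program points of m. */
    static final class MethodGraph {
        final String name;                                         // e.g. "purse.Purse.debit"
        final Map<String, PointType> type = new HashMap<>();       // program point -> point type
        final Map<String, Set<String>> edges = new HashMap<>();    // transfer edges
        final Map<String, Set<String>> callees = new HashMap<>();  // call point -> invoked methods
        MethodGraph(String name) { this.name = name; }
        String className() { return name.substring(0, name.lastIndexOf('.')); }
        String packageName() {
            String c = className();
            return c.substring(0, c.lastIndexOf('.'));
        }
    }

    /** v : loc m */
    static boolean loc(MethodGraph m, String v) { return m.type.containsKey(v); }

    /** v : entry m (false when v is not a program point of m) */
    static boolean entry(MethodGraph m, String v) { return m.type.get(v) == PointType.ENTRY; }

    /** v : class c, i.e. v lies in some method belonging to class c */
    static boolean inClass(Collection<MethodGraph> ms, String c, String v) {
        for (MethodGraph m : ms) {
            if (loc(m, v) && m.className().equals(c)) return true;
        }
        return false;
    }

    /** v : package p, i.e. v lies in some method of some class in package p */
    static boolean inPackage(Collection<MethodGraph> ms, String p, String v) {
        for (MethodGraph m : ms) {
            if (loc(m, v) && m.packageName().equals(p)) return true;
        }
        return false;
    }
}
```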


3 Model Checking Method Call Graphs


3.1 Pushdown Systems 


Pushdown systems provide a natural execution 
model for programs with recursion. They form a 
well-studied class of infinite-state systems for which 
many important problems like equivalence checking 
and model checking are decidable [4]. 


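Definition 2 (PDS, from [7]). A pushdown
system (PDS) is a tuple


$\mathcal{P} \triangleq (P, \Gamma, \Delta)$


where:


(i) $P$ is a finite set of control locations;

(ii) $\Gamma$ is a finite set of stack symbols;

(iii) $\Delta \subseteq (P \times \Gamma) \times (P \times \Gamma^*)$ is a finite set of rewrite
rules of the shape $\langle p, \gamma \rangle \rightarrow \langle q, \alpha \rangle$.


The set $P \times \Gamma^*$ are the configurations of $\mathcal{P}$. If
$\langle p, \gamma \rangle \rightarrow \langle q, \alpha \rangle$ is a rewrite rule of $\mathcal{P}$, then for each
$w \in \Gamma^*$ the configuration $\langle q, \alpha \cdot w \rangle$ is an immediate
successor of the configuration $\langle p, \gamma \cdot w \rangle$. A run of $\mathcal{P}$
is a sequence $\rho = \langle p_0, \sigma_0 \rangle \langle p_1, \sigma_1 \rangle \langle p_2, \sigma_2 \rangle \cdots$, such
that for all $i$, $\langle p_{i+1}, \sigma_{i+1} \rangle$ is an immediate successor
of $\langle p_i, \sigma_i \rangle$.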


We now define how a set of methods M induces a 
PDS. 
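Definition 3 (Induced PDS, formalising [8]).
A set of methods $M$ induces a PDS


$\mathcal{P} \triangleq (P, \Gamma, \Delta)$


as follows:


(i) $P$ consists of the single control location $p$;

(ii) $\Gamma$ is the set $V$ of program points;

(iii) $\Delta$ is the set $\bigcup_{m \in M} \bigcup_{v \in V_m} \text{Prod}(v)$, where
$\text{Prod}(v)$ is a set of rewrite rules defined as:


$\{\langle p, v \rangle \rightarrow \langle p, v' \rangle \mid v \rightarrow_m v'\}$
if $v : \text{entry}$ or $v : \text{seq}$


$\{\langle p, v \rangle \rightarrow \langle p, v' \cdot v'' \rangle \mid m' \in L_m(v),\ v' : \text{entry}\ m',\ v \rightarrow_m v''\}$
if $v : \text{call}$


$\{\langle p, v \rangle \rightarrow \langle p, \epsilon \rangle\}$
if $v : \text{return}$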


The rewrite rules of the pushdown system can be 
interpreted as simply manipulating the calling stack 
of the program from which the PDS was obtained. 
Given a configuration $c = \langle p, v \cdot \sigma \rangle$ let $\text{point}(c) \triangleq v$.
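
The stack-manipulation reading of the rewrite rules can be sketched in Java as follows. This is an illustration under assumptions rather than the encoding used with Moped: the names Rule, prod and step are hypothetical, the single control location is left implicit, and the call case is read as pushing the entry point of the callee on top of the point at which the caller resumes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

final class InducedPdsSketch {
    enum PointType { ENTRY, SEQ, CALL, RETURN }

    /** Rewrite rule: the top stack symbol 'top' is replaced by the word 'push'. */
    static final class Rule {
        final String top;
        final List<String> push;
        Rule(String top, List<String> push) { this.top = top; this.push = push; }
    }

    /** Rules for one program point v; succ are its transfer successors. */
    static List<Rule> prod(String v, PointType t, Set<String> succ, Set<String> calleeEntries) {
        List<Rule> rules = new ArrayList<>();
        switch (t) {
            case ENTRY:
            case SEQ:      // move on to a successor point
                for (String v1 : succ) rules.add(new Rule(v, Arrays.asList(v1)));
                break;
            case CALL:     // push the callee entry on top of the point to resume at
                for (String e : calleeEntries)
                    for (String v2 : succ) rules.add(new Rule(v, Arrays.asList(e, v2)));
                break;
            case RETURN:   // pop the current point
                rules.add(new Rule(v, Collections.<String>emptyList()));
                break;
        }
        return rules;
    }

    /** Immediate successor of a stack (top at index 0) under an applicable rule. */
    static List<String> step(List<String> stack, Rule r) {
        if (stack.isEmpty() || !stack.get(0).equals(r.top)) {
            throw new IllegalArgumentException("rule not applicable");
        }
        List<String> next = new ArrayList<>(r.push);
        next.addAll(stack.subList(1, stack.size()));
        return next;
    }
}
```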


3.2 Specification Language 


Our specification language is linear temporal logic 
(LTL), with program point predicates p as atomic 
propositions but omitting the type predicate v : t. 
The choice of linear temporal logic as the
specification language, instead of for instance the
modal μ-calculus for which the model checking problem
for our encoding into pushdown systems is also
efficiently decidable, was solely motivated by the
existence of the efficient model checker Moped [8] for
LTL.


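The operators of the logic are the standard ones. If
$\phi$ and $\psi$ are formulas then so are $\neg\phi$, $\phi \wedge \psi$, $\phi \vee \psi$, $X\,\phi$
and $\phi\,U\,\psi$. The meaning of formulas is defined with
respect to runs of infinite length $r = c_0 c_1 c_2 \ldots$. We
let $r_i$ denote the suffix of $r$ starting in configuration
$c_i$. Then satisfaction $r \models \phi$ of a formula $\phi$ by a run
$r$ is defined as:

$r \models p$ iff $\text{point}(c_0) : p$
$r \models \neg\phi$ iff not $r \models \phi$
$r \models \phi \wedge \psi$ iff $r \models \phi$ and $r \models \psi$
$r \models \phi \vee \psi$ iff $r \models \phi$ or $r \models \psi$
$r \models X\,\phi$ iff $r_1 \models \phi$
$r \models \phi\,U\,\psi$ iff there is an $i \geq 0$ such that $r_i \models \psi$ and $r_j \models \phi$ for all $0 \leq j < i$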


Henceforth let false abbreviate $p \wedge \neg p$ for some
atomic predicate $p$, true abbreviate $\neg$false, $\phi \Rightarrow \psi$
abbreviate $\neg\phi \vee \psi$, next $\phi$ abbreviate
$X\,\phi$, and $\phi$ until $\psi$ abbreviate $\phi\,U\,\psi$. Further
define eventually $\phi \triangleq$ true $U\,\phi$ and
always $\phi \triangleq \neg$(eventually $\neg\phi$). The weak until
operator $\phi$ weakuntil $\psi$ abbreviates $\phi$ until $\psi \vee$ always $\phi$.
Finally let never $\phi \triangleq$ always $\neg\phi$.
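
As a small worked example of unfolding these abbreviations, never can be reduced to the basic operators in three steps. The derivation uses only the definitions of always and eventually given in the text, plus elimination of the double negation:

```latex
\begin{align*}
\text{never}\ \phi
  &\triangleq \text{always}\ \neg\phi \\
  &\triangleq \neg(\text{eventually}\ \neg\neg\phi) \\
  &= \neg(\text{eventually}\ \phi) \\
  &\triangleq \neg(\text{true}\ U\ \phi)
\end{align*}
```

So never $\phi$ and $\neg$(eventually $\phi$) are interchangeable.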


Given a PDS pds let the notation $m \vdash \phi$ express
the judgement that all runs starting in the entry
program point of the method $m$ satisfy $\phi$. More
formally:


Definition 4 (Model Checking a Method
Call). Given a PDS pds with the single control
location $p$ and a method $m$, the judgement $m \vdash \phi$ is
valid iff for every run $r$ of the PDS pds' from the
initial configuration $\langle p, v \cdot \text{m\_loop} \rangle$, $r \models \phi$ holds,
where $v$ is the entry program point of method $m$
(i.e. $v : \text{entry}\ m$), and pds' is the PDS pds extended
with the fresh stack symbol m_loop and the single
rewrite rule $\langle p, \text{m\_loop} \rangle \rightarrow \langle p, \text{m\_loop} \rangle$ to achieve
infinite runs.


The definition of a judgement $m \vdash \phi$ is motivated
by the Moped tool, which implements an algorithm
for checking an initial configuration against an LTL
formula.


3.3 Specification Patterns 


As in the Bandera project [6] specification patterns 
are used to facilitate formulating correctness prop- 
erties. These specification patterns concern tempo- 
ral properties of method invocations, and are either 
temporal patterns or judgement patterns concerning 
the invocation of a particular method. Below a set 
of patterns that we have defined, and which are com- 
monly used, are given. 


To express that within the call of a method $m$ the
property $\phi$ holds, the judgement pattern


$\text{Within}\ m\ \phi \;\triangleq\; m \vdash \phi$


is used. The property that a call to $m_1$ never
triggers method $m_2$ can be specified as:


$m_1\ \text{never triggers}\ m_2 \;\triangleq\; \text{Within}\ m_1\ (\neg(\text{eventually loc}\ m_2)) \;=\; \text{Within}\ m_1\ (\text{never loc}\ m_2)$


Next define the temporal patterns (formulas) (i)
$m_2$ after $m_1$, i.e., $m_2$ can only be called after a call
to $m_1$; (ii) $m_2$ through $m_1$, i.e., $m_2$ can only be
called from $m_1$; (iii) $m_2$ from $m_1$, i.e., $m_2$ can only
be called directly from $m_1$; (iv) $m_1$ excludes $m_2$,
i.e., when $m_1$ is called this excludes the possibility
that $m_2$ will later be called; and (v) $p$ cannotCall $m$, i.e.,
the method $m$ cannot be directly called from any
method in package $p$.


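$m_2\ \text{after}\ m_1 \;\triangleq\; (\text{never loc}\ m_2)\ \text{weakuntil loc}\ m_1$


$m_1\ \text{excludes}\ m_2 \;\triangleq\; (\text{eventually loc}\ m_1) \Rightarrow \text{never loc}\ m_2$


$m_2\ \text{from}\ m_1 \;\triangleq\; \text{always}\ (\neg(\text{loc}\ m_1 \vee \text{loc}\ m_2) \Rightarrow \text{next}\ \neg\text{loc}\ m_2) \;\wedge\; \neg\text{loc}\ m_2$


$m_2\ \text{through}\ m_1 \;\triangleq\; \neg\text{loc}\ m_2\ \text{weakuntil loc}\ m_1 \;\wedge\; \text{always}\ (\text{return}\ m_1 \Rightarrow \text{next}\ (\neg\text{loc}\ m_2\ \text{weakuntil loc}\ m_1))$


$p\ \text{cannotCall}\ m \;\triangleq\; \text{always}\ (\text{package}\ p \Rightarrow \text{next}\ \neg\text{loc}\ m)$
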

The intuitive idea of the formulation of $m_2$ from $m_1$
is to express that the current program point can
be in method $m_2$ only because of a direct call from
$m_1$, or because it was already in $m_2$, and initially
the program point is not in $m_2$.


The above patterns can be combined with the 
Within pattern. For example, 


$\text{Within}\ m_1\ (m_3\ \text{after}\ m_2)$


expresses that during a call to $m_1$ the method $m_3$
will be called only after calling $m_2$.
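
To give a feel for what the after pattern demands of a single run, the following Java sketch checks its informal reading on a finite trace prefix: no program point of the second method may occur before the first program point of the first. This is an illustration only. The actual check is performed by the model checker over all runs of the pushdown system, and the helper names, the string representation of program points and the assumption that method names contain no underscore are all hypothetical.

```java
import java.util.List;

final class AfterPatternSketch {
    /** loc m: the program point is assumed to be named m_entry, m_exit or m_n. */
    static boolean loc(String point, String m) {
        return point.startsWith(m + "_");
    }

    /**
     * Informal reading of "m2 after m1" on a finite prefix of a run:
     * returns false iff a point of m2 occurs before any point of m1.
     */
    static boolean after(List<String> trace, String m2, String m1) {
        for (String point : trace) {
            if (loc(point, m1)) return true;   // m1 has been entered, m2 is now allowed
            if (loc(point, m2)) return false;  // m2 reached before any call to m1
        }
        return true;                           // weak until: m1 need never occur
    }
}
```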


An alternative technique for expressing correctness 
properties of behaviours of programs of stack-based 
languages is to use stack inspection techniques [13]. 
Essentially these techniques express constraints on 
the set of all possible runtime stacks. Note however
that for instance the after property above cannot
directly be coded as a stack inspection property since
the calls to $m_1$ and $m_2$ need not be concurrent.


3.4 A Tool for Model Checking Pushdown Systems


The Moped tool [8] can check a pushdown system,
from an initial configuration, against an LTL
formula whose atomic predicates are atomic symbols
that check the identity of the top stack symbol or the
control location (i.e., simply check name equality).
If the LTL formula is falsified, a reduced pushdown
system constructed from the original one, which also
falsifies the LTL formula, is presented as diagnostic
information.


To represent the non-identity atomic predicates 
(e.g., package, entry,...) as “Moped LTL formulas” 
a number of options are possible. Consider for in- 
stance the package atomic predicate. A direct rep- 
resentation of the predicate in Moped LTL would 
consist of a disjunction over all the program points 
in any class in the package. 


An alternative representation strategy is to enrich 
the translation from a call graph to a pushdown 
system. Since Moped provides boolean variables 
we could represent the current package identity en- 
coded in a set of boolean variables in the pushdown 
system. These variables would then be updated for 
every rewrite rule that crosses package boundaries. 
Finally the representation of the package predicate 
itself would consist of a simple boolean condition. 


We have instead opted to extend the Moped tool 
with atomic predicates that can match a control 
location, or the top stack symbol, against a regu- 
lar expression. These predicates check the syntactic 
shape of the symbol being tested. 


Consider the naming of program points of a method m by the call graph construction. Its entry program point will be named m_entry, its (unique) return program point will be named m_exit, and all other program points in m are of the form m_n where n is a natural number.


With these conventions in place the atomic predicates can be represented in “regular expression Moped” as indicated below:


loc m $\triangleq$ m_.*
entry m $\triangleq$ m_entry
return m $\triangleq$ m_exit
class c $\triangleq$ c\..*_.*
package p $\triangleq$ p\..*\..*_.*


In the encoding it is assumed that the dot symbol ‘.’ 
has to be quoted using a backslash character inside 
a regular expression to represent itself, rather than 
representing any character. 
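The encoding of these predicates can be tried out with an ordinary regular expression library. The following is a minimal Java sketch, not part of the extended Moped tool: the class `PointPredicates` and its method names are hypothetical, and `Pattern.quote` is used as a stand-in for quoting the dots in a name by hand.

```java
import java.util.regex.Pattern;

/** Hypothetical helper that builds the predicate patterns for a given name. */
final class PointPredicates {
    // Pattern.quote makes every character of the name, dots included, literal.
    static Pattern loc(String method)    { return Pattern.compile(Pattern.quote(method) + "_.*"); }
    static Pattern entry(String method)  { return Pattern.compile(Pattern.quote(method) + "_entry"); }
    static Pattern ret(String method)    { return Pattern.compile(Pattern.quote(method) + "_exit"); }
    static Pattern inClass(String cls)   { return Pattern.compile(Pattern.quote(cls) + "\\..*_.*"); }
    static Pattern inPackage(String pkg) { return Pattern.compile(Pattern.quote(pkg) + "\\..*\\..*_.*"); }

    /** True if the whole program-point name matches the predicate. */
    static boolean holds(Pattern predicate, String programPoint) {
        return predicate.matcher(programPoint).matches();
    }
}
```

Under this sketch the point `purse.Purse.Purse.process_entry` satisfies the package predicate for `purse.Purse`, whereas a point in `purse.LoyaltyA` does not satisfy the package predicate for `purse.Loyalty`, because the quoted package name must be followed by a literal dot.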


So-called wildcards can be used in a regular expression to achieve a limited form of quantification over program points. The static analysis tool, for instance, gives the name p.c.<init> to an object constructor method of a class p.c. Thus, whether the current program point is in any object constructor can be tested by the regular expression predicate .*\..*\.<init>_.*. As a further example, the api predicate, which recognises control points inside an API function, can be defined as

‘(java\.lang|javacard\..*|javacardx\..*).*’.
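As a quick check of the two wildcard predicates, the sketch below compiles them with `java.util.regex`. The class name `WildcardPredicates` and the sample program-point names are assumptions made for illustration; only the two regular expressions come from the text.

```java
import java.util.regex.Pattern;

/** Hypothetical holder for the constructor and api predicates. */
final class WildcardPredicates {
    // Any program point inside a method named <init> of some class p.c.
    static final Pattern CONSTRUCTOR = Pattern.compile(".*\\..*\\.<init>_.*");
    // Any program point whose name starts with an API package prefix.
    static final Pattern API =
            Pattern.compile("(java\\.lang|javacard\\..*|javacardx\\..*).*");

    /** True if the whole program-point name matches the predicate. */
    static boolean holds(Pattern predicate, String programPoint) {
        return predicate.matcher(programPoint).matches();
    }
}
```

For instance, `purse.Purse.Purse.<init>_entry` satisfies the constructor predicate but not the api predicate.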


4 Example 


The model checking of JavaCard applets will be il- 
lustrated with an example; a modification of the 
purse example from SUN's JavaCard Development 
Kit (version 2.1.2). This example is a prototypi- 
cal purse and loyalty smartcard application, which 
comprises around 1430 lines of JavaCard code. 


To understand the example it is helpful to recall the 
execution characteristics of JavaCard applets (lan- 
guage version 2.1.1). An interaction with the card 
(after installation of an applet, and its selection) is 
initiated through calling its process method. Inter- 
applet communications, crossing package borders, 
are controlled by the JavaCard firewall mechanism 
and take place through special interface objects. 
The methods of such interface objects are indicated 
in Figure 1. 


The purse applet keeps a balance that is updated upon requests from the environment. Purse transactions, whether successful or not, are logged to a transaction log. The operations of updating the balance, logging the new transaction and updating the transaction number are made atomic through use of the transaction facility of JavaCard.


[Figure 1: Purse Class Diagram. Package purse.Purse contains the interface PurseLoyalty (method bonusPointsToPurse), implemented by class Purse. Package purse.Loyalty contains the interfaces LoyaltyPurse (method grantPoints) and LoyaltyLoyalty (method grantLoyaltyPoints), implemented by class Loyalty. Packages purse.LoyaltyA and purse.LoyaltyB contain the classes LoyaltyA and LoyaltyB, which extend Loyalty.]


Upon completion of a new purse transaction the 
purse applet notifies subsidiary loyalty applets via 
the interface method grantPoints. These are ap- 
plets that should be notified of the balance update 
so that they, for example, can award loyalty points. 
A concrete example is a bank smartcard with an 
embedded loyalty applet for a car rental company 
that awards bonus points for every car rented using 
the bank card. 


In addition to these functionalities there are meth- 
ods, accessible through the process method of the 
applet, for modifying most of the parameters of the 
purse applet, including adding knowledge about new 
loyalty applets that should be notified when card 
transactions occur. 


The loyalty applets of the Development Kit purse application do not attempt to communicate with other applets. We have extended the example loyalty applet with two new functionalities: (i) a loyalty applet can have agreements with other loyalty applets to share bonus points; to achieve this we introduce direct communication between loyalty applets using the interface method grantLoyaltyPoints. (ii) Loyalty applets can have an agreement with the purse to transfer, according to some fixed rate, part of the bonus points back to the purse. This is achieved through calling the interface method bonusPointsToPurse of the purse.


The modified purse and loyalties example is a re- 
warding example to study using our model checking 
approach as many key applet correctness properties 
can be phrased as properties of inter-applet commu- 
nications. 


4.1 Example Properties 


Below we list a number of properties of the purse 
and loyalty applets, formulated using our judgement 
patterns. We introduce the following abbreviations 
of the applet class names: 


purse $\triangleq$ purse.Purse.Purse
loyaltyA $\triangleq$ purse.LoyaltyA.LoyaltyA
loyaltyB $\triangleq$ purse.LoyaltyB.LoyaltyB


Property 1: there are no calls to both grantPoints and grantLoyaltyPoints for the same applet. For all loyalty applets L it is the case that a call to L.grantPoints never triggers a call to L.grantLoyaltyPoints.


$\phi_{1.1} \triangleq$ loyaltyA.grantPoints never triggers loyaltyA.grantLoyaltyPoints
$\phi_{1.2} \triangleq$ loyaltyB.grantPoints never triggers loyaltyB.grantLoyaltyPoints


Property 2: grantPoints is not transitive. For all loyalty applets L and L' it is the case that a call to L.grantPoints never triggers a call to L'.grantPoints. That is, the grantPoints method is neither transitive nor recursive.


$\phi_{2.1} \triangleq$ loyaltyA.grantPoints never triggers loyaltyA.grantPoints
$\phi_{2.2} \triangleq$ loyaltyA.grantPoints never triggers loyaltyB.grantPoints
$\phi_{2.3} \triangleq$ loyaltyB.grantPoints never triggers loyaltyA.grantPoints
$\phi_{2.4} \triangleq$ loyaltyB.grantPoints never triggers loyaltyB.grantPoints


Property 3: grantLoyaltyPoints is not 
transitive. The same as Property 2, but for 
grantLoyaltyPoints. 


Property 4: grantLoyaltyPoints is called only 
through grantPoints. That is, within all purse 
methods m accessible from outside the card, the 
method L.grantLoyaltyPoints of a loyalty applet L 
is called only through a call to L’.grantPoints of an- 
other loyalty applet L’ and never directly by the 
purse applet. 


$\phi_{4.1} \triangleq$ Within m
    loyaltyA.grantLoyaltyPoints through loyaltyB.grantPoints


$\phi_{4.2} \triangleq$ Within m
    loyaltyB.grantLoyaltyPoints through loyaltyA.grantPoints


Property 5: Bonus points are awarded at most once within a transaction. Transfer of bonus points from a loyalty applet to the purse does not cause further bonus points to be awarded.

$\phi_5 \triangleq$ Within purse.bonusPointsToPurse
    package purse.Purse $\lor$ api


That is, calls to the bonusPointsToPurse method do not cause a context switch to any other applet package (but possibly to the JavaCard Runtime Environment, JCRE).


The previous correctness properties were specific to 
certain applications whereas the following express 
properties that can be beneficial for any JavaCard 
applet. 


Property 6: no constructors called. For all applets A it is the case that no constructor method is called within a call to A.process. This can be a crucial property for applets due to the absence of garbage collection in standard JavaCards. Let constructor express the regular expression predicate .*\..*\.<init>_.* which tests whether the current location is in a constructor method.


$\phi_{6.1} \triangleq$ Within purse.process $\lnot$constructor
$\phi_{6.2} \triangleq$ Within loyaltyA.process $\lnot$constructor
$\phi_{6.3} \triangleq$ Within loyaltyB.process $\lnot$constructor


This property holds of the loyalty applets, but not 
of the purse applet which can create a new object 
during a call to the process method (without bad 
consequences, due to conditions involving data). 


Property 7: recursion freeness. For all non- 
API methods m it is the case that a call to m never 
triggers another call to m. 


$\phi_7 \triangleq$ m never triggers m
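To give an intuition for the "never triggers" pattern, the sketch below checks a coarser condition: plain reachability in a method-level call graph. This is an assumption-laden simplification rather than the pushdown-system and LTL check performed with Moped; the class `CallGraph` and its methods are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical method-level call graph: an edge means "may call directly". */
final class CallGraph {
    private final Map<String, Set<String>> calls = new HashMap<>();

    void addCall(String caller, String callee) {
        calls.computeIfAbsent(caller, k -> new HashSet<>()).add(callee);
    }

    /** True if some chain of one or more calls leads from 'from' to 'to'. */
    boolean mayTrigger(String from, String to) {
        Deque<String> work =
                new ArrayDeque<>(calls.getOrDefault(from, Collections.emptySet()));
        Set<String> seen = new HashSet<>();
        while (!work.isEmpty()) {
            String m = work.pop();
            if (m.equals(to)) return true;
            // Expand each method only once so that cycles terminate.
            if (seen.add(m)) work.addAll(calls.getOrDefault(m, Collections.emptySet()));
        }
        return false;
    }

    /** Conservative reading of "m never triggers m". */
    boolean recursionFree(String m) {
        return !mayTrigger(m, m);
    }
}
```

If `recursionFree` holds in such a graph, no call chain leads from m back to m; when it fails, the method is only potentially recursive, since the graph over-approximates the actual calls.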


The elapsed time to construct the set of call graphs from the example classes was approximately 16 seconds on a Linux workstation with a Pentium III 1.9 GHz CPU and 256 MB of memory. The resulting call graphs, which include API program points, consist of 2034 nodes and 3747 edges. The pushdown system generated from these call graphs has approximately 1200 production rules. Checking the pushdown system against each of the formulas above, given an initial configuration, took less than one second on the same computer hardware as used for call graph generation.


5 Conclusions and Future Work 


The paper proposes a framework for automatic model checking of temporal constraints on inter-applet communications in multi-applet JavaCards. The framework has been realised by combining a class-based static analysis tool with an automatic model checker for pushdown systems and linear temporal logic.


In the future we will refine the static analysis to 
permit the analysis of communication capabilities of 
single applets thus connecting to the work on com- 
positional proof systems for JavaCard applets sug- 
gested in [1]. This will permit us to analyse whether 
an applet can operate safely on a smart card even 
when the knowledge about other applets on the card 
is imperfect. 


Further information regarding the model checking framework and the availability of the tool components and examples can be obtained at the web location

http://www.sics.se/fdt/projects/VeriCode/.


Abstract


The access control exercised by the Java Card firewall can be bypassed by the use of shareable objects. To help detect unwanted access to objects, we propose a static analysis that calculates a safe approximation of the possible flow of objects between Java Card applets. The analysis deals with a subset of the Java Card bytecode, focusing on aspects of the Java Card firewall, method invocation, field access, variable access, shareable objects and contexts. The technical vehicle for achieving this task is a new kind of constraint: quantified conditional constraints, which permit us to model precisely the effects of the Java Card firewall by only producing a constraint if the corresponding operation is authorized by the firewall.


1 Introduction 


The Java Card language is a subset of Java, tailored to the limited resources available on today's smart cards. Java Card keeps the essence of Java, such as inheritance, virtual methods and overloading, but leaves out features such as large primitive data types (long, double and float), characters and strings, multidimensional arrays, garbage collection, object cloning and security managers [1, 10]. Given the security-critical application areas of Java Card, the language has been endowed with an elaborate security architecture. A priori, applets are separated by a firewall which prevents one applet from accessing objects owned by another applet. Thus, even if a foreign applet obtains a reference to an object with confidential information, this does not imply that the information is leaked. In order to provide a means of communication between separated applets, objects can be marked as shareable. This makes it possible to grant access to (a subset of) the methods of the objects through the firewall. The problem is that marking an object as shareable means that its shared methods can be accessed by all applets that manage to get a reference to the object. To counter this problem, Java Card offers a limited form of stack inspection, allowing a “server” applet to know the identity of a “client” object which invoked a particular method. This, however, must be programmed explicitly by the application programmer. These mechanisms (described in detail in Section 2) allow the design of secure applications but do not themselves guarantee security. Further code analysis must be employed to establish that the checks programmed in the server applet guarantee that confidential data is not leaked via shared objects. To summarize:


The Java Card firewall can be bypassed by using shareable objects. Data flow analysis makes it possible to calculate a safe approximation to the access control actually implemented by a set of applets, and thus to verify that a given access policy is respected.


This paper presents a flow analysis for Java Card programs. The analysis is constraint-based in that for each instruction of the program it generates a set of constraints describing the data flow of the instruction. Solving this system of constraints yields the possible values of the variables used in the program and the methods that may be called. The analysis relies on a novel technical device, quantified conditional constraints (QCCs), that allows the set of constraints of a program to be generated on demand. This way of generating constraints is useful and natural when analyzing object-oriented languages where the control flow and the data flow are inter-dependent. It generalizes the conditional constraints proposed by Palsberg and Schwartzbach [20] for object-oriented type analysis.


The paper is organized as follows. Sec- 
tion 2 introduces the central features of the Java 
Card 2.1.1 firewall and provides a detailed ex- 
ample. Section 3 defines our representation of 
the Java Card bytecode. The abstract domains 
used in the analysis are given in Section 4 and 
Section 5 defines the set of quantified conditional 
constraints generated for each type of instruction. 
Section 6 shows how these QCCs can be solved 
iteratively and Section 7 shows how the analysis 
performs on the example from Section 2. Sec- 
tion 8 and Section 9 discuss related works and 
directions for extending this work. 


2 The Java Card firewall 


The Java Card platform is a multi-application 
environment in which an applet’s sensitive data 
must be protected against malicious access. In 
Java, this protection is achieved using class load- 
ers and security managers to create private name 
spaces for applets. In Java Card, class loaders and 
security managers have been replaced with the 
Java Card firewall. The separation enforced by 
the firewall is based on the Java Card’s package 
structure (the same as Java’s) and the notion of 
contexts (in Java Card, this notion is called group 
context). 


When an applet is created, the Java Card Runtime Environment (JCRE) assigns it a unique applet identifier (AID). If two applets are instances of classes coming from the same Java Card package, they are said to belong to the same context, identified by the package name. In addition to the contexts defined by the applets executed on the card, there is a special “system” context, called the JCRE context. Applets belonging to this context can access objects from any other context on the card. Thus, the set of Java Card contexts is defined by:


$\text{Java Card contexts} = \{\mathit{JCRE}\} \cup \{\mathit{pckg} : \mathit{pckg} \text{ is a package name}\}$


Every object is assigned a unique owner context, viz. the context of the applet which created the object. A method of an object is said to execute in the context of its owner¹. It is with this context that the JCRE determines whether an access to another object will succeed. The firewall isolates the contexts in the sense that a method executing in one context cannot access any fields or methods of objects belonging to another context.


There are two ways for the firewall to be by- 
passed: via JCRE entry points and via shareable 
objects. JCRE entry points are objects owned by 
the JCRE that have been specifically designated 
as objects accessible from any context. The most 
prominent example is the Application Protocol 
Data Unit (APDU) buffer in which commands 
sent to the card are stored. This object is man- 
aged by the JCRE, and in order to allow applets 
to access this object, it is designated as an entry 
point. Other entry points can be the elements of 
the table containing the AIDs of the applets in- 
stalled on the card. Entry points can be marked as 
temporary. References to temporary entry points 
cannot be stored in objects (this is enforced by 
the firewall). 


Two applets in different contexts may want to 
share some information. Java Card offers a shar- 
ing mechanism, called shareable objects, that 
gives limited access to objects across contexts. 
An applet can allow another applet to access an 
object’s methods from outside its context. The 
mechanism is restricted to methods and cannot 
be applied to fields. It uses a shareable inter- 
face, that is an interface which extends java- 
card.framework.Shareable. In this in- 
terface, the applet gives the list of the method’s 
signatures it wants to share. The class of the ob- 
ject to share must implement this interface. The 
“server” applet defines a method, getShare- 
ableInterfaceObject, called when an ap- 
plet is asked to provide a shared object. The 
method receives the AID of the “client” applet 
which requested the shared object. Based on this 
information, the server decides what to return to 
the client, thus it is possible to share different ob- 
jects with different client applets. 


2.1 Anexample using shareable objects 


Figure 1 contains an example illustrating the sharing mechanisms of the firewall. We have three applets: Alice, Bob and Charlie. Alice implements a shareable interface MSI (we assume an interface MSI that extends Shareable and in which the signature of the method foo is given) and is prepared to share an object MSIO (an instance of the class that implements the interface MSI) with Bob. When Alice receives a request for sharing (via a call by the JCRE to her method getSIO²), she verifies that the caller is Bob. If it is Bob, she returns MSIO; otherwise she returns null.


¹ In the case of a static call, the execution is in the caller's context.


Bob can ask for a shareable object from Alice using the JCRE method getASIO³. Assume now that Bob (inadvertently) leaks a reference to MSIO to the third applet Charlie. Since the firewall only checks that the object is shared before granting access, Charlie can invoke the same methods of the MSIO object as Bob. Alice knows this, so she decides to verify, at each access to one of her shared methods, the identity of the caller. Java Card offers a method for obtaining the AID of the context in operation before the last context switch, here called getPrevCxt⁴. Using this information Alice can discover when applets from contexts other than Bob's attempt to access the MSIO object.
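The scenario can be replayed in plain Java without a card. The sketch below is a simulation under simplifying assumptions and does not use the Java Card API: applets are identified by strings, the static field `previousContext` is a hypothetical stand-in for what getPrevCxt would report, and `callFooFrom` mimics the context switch performed on a cross-context call.

```java
/** Plain-Java simulation of the example; none of these types are Java Card types. */
final class SharingDemo {
    // Hypothetical stand-in for the context active before the last context switch.
    static String previousContext = "JCRE";

    static final class Alice {
        private final String secret = "alice-secret";

        /** Stand-in for getSIO: hand out the shared object to Bob only. */
        Alice getSIO(String clientAid) {
            return "Bob".equals(clientAid) ? this : null;
        }

        /** Shared method foo: re-checks the identity of the caller on every access. */
        String foo() {
            return "Bob".equals(previousContext) ? secret : null;
        }
    }

    /** Simulates a call to foo made from the context of the given applet. */
    static String callFooFrom(String callerAid, Alice sharedObject) {
        previousContext = callerAid; // context switch into Alice's context
        return sharedObject.foo();
    }
}
```

In this simulation Charlie can still invoke foo on a leaked reference, which mirrors the fact that the firewall lets the call through, but the check inside foo means he receives null instead of the secret.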


2.2 Limitations of the firewall 


The Java Card firewall has several shortcomings, as analysed in detail by Montgomery and Krishna [18]. One potential difficulty with the Java Card firewall is that shareable objects can be accessed by any applet and not only by the applet to which the reference was given, as illustrated by the example above. Since references can be passed from one applet to another, this opens up the possibility for methods in shared objects to be invoked by applets other than those for which they were intended. To protect applets against unwanted access, Java Card offers a limited form of the stack inspection mechanism that underlies the Java 2 security architecture. The system method getPrevCxt can be called to get access to the last context switch that took place. When a method is called from another applet, this context switch indicates the identity of the caller. This information can then be used to decide what value the method should return to the caller. It is, however, up to the programmer to implement this correctly. If the security mechanisms provided by the language are not used properly, unwanted information flow can arise as a result of objects flowing from one applet to another. In order to verify the access control actually implemented by a set of Java Card applets we have developed a static analysis that calculates, for each variable in a program, an approximation of the set of values that will be stored in this variable. This static approximation allows


• to signal potential data flow between applets that violates a given access control policy,


• or, if no such flow is detected, to provide a proof that all data flow respects the policy.


² In reality, this method is called getShareableInterfaceObject and is invoked by the JCRE, which mediates all requests for shared objects.

³ In reality, the method JCSystem.getAppletShareableInterfaceObject.

⁴ In reality, this method is called JCSystem.getPreviousContextAID.


The analysis is based on a constraint-based type 
analysis for Java-like languages, but is modified 
to keep an accurate account of the Java Card 
specificities (like context and firewall). Indeed, 
since the security of an applet to a large extent 
relies on the use of the getPrevCxt method, 
the analysis must be able to model calls to this 
method precisely. 


3 A representation of Java Card 
bytecode 


To simplify the presentation, we work with a “three-address” representation of Java Card bytecode where arguments and results of an instruction are fetched from and stored in local variables instead of being popped from and pushed onto a stack. This format is similar to the intermediate language Jimple used in the Java tool Soot [23] and the transformation of code into this format is straightforward. We furthermore assume that the constant pool has been expanded, i.e., that indices into the constant pool have been replaced by the corresponding constants. For example, the bytecode instruction invokevirtual takes as parameter the signature of the method called, rather than an index into the constant pool. The formal representation of Java Card bytecode can be found in [17].


public interface MSI extends Shareable {
    Secret foo();
}

public class Alice extends Applet implements MSI {
    private Secret ObjectSecret;

    // Called on a sharing request: only Bob obtains the shared object.
    public Shareable getSIO(AID Client) {
        if (Client.equals(BobAID))
            return (this);
        return null;
    }

    // Shared method: checks the identity of the caller on every access.
    public Secret foo() {
        AID Client;
        Secret Response = null;  // stays null unless the caller is Bob
        Client = getPrevCxt();
        if (Client.equals(BobAID))
            Response = ObjectSecret;
        return Response;
    }
}

public class Bob extends Applet {
    public static MSI AliceObj;  // public static field: leaks the reference

    public void bar() {
        AliceObj = (MSI) getASIO(AliceAID);
    }
}

public class Charlie extends Applet {
    private static MSI AliceObj;
    private static Secret AliceSecret;

    public void bar() {
        AliceObj = Bob.AliceObj;
        AliceSecret = AliceObj.foo();
    }
}


Figure 1: Example of shareable objects


3.1 Notations 


The term $\mathcal{P}(X)$ denotes the power set of X: $\mathcal{P}(X) = \{S \mid S \subseteq X\}$. A product type $X = A \times B \times C$ is sometimes treated as a labeled record: with an element x of type X, we can access its fields with the names of its constituent types (x.A, x.B or x.C). A list is defined by enumeration of its elements: $x_1 :: \cdots :: x_n$. List elements can be directly accessed by giving their position ($v(i)$ for the $i^{th}$ element). Lists can be concatenated: $(x_1 :: \cdots :: x_n) :: (y_1 :: \cdots :: y_m) = x_1 :: \cdots :: x_n :: y_1 :: \cdots :: y_m$. $X^*$ denotes the type of finite lists whose elements are of type X. The symbol $\rightarrow$ is used to form the type of partial functions: $X \rightarrow Y$. The notation $\overline{v} \in \overline{E}$ denotes the formula $v_1 \in E_1 \land \cdots \land v_n \in E_n$.


3.2 Abstract syntax 


Our program representation is a modified version of that of Bertelsen [5, 6]. We use $\mathit{Id}_p$, $\mathit{Id}_{ci}$, $\mathit{Id}_f$ and $\mathit{Id}_m$ to denote the set of qualified names of a package, of a class or an interface, of a field and of a method, respectively⁵. $\mathit{Id}_v$ is the set of (unqualified) names of variables. To extract name information from an identifier, we use the notation $[\mathit{Id}]^x$, where Id is a qualified name and x the type of the projection⁶. We assume a set AID which contains the possible applet identifiers of the applets installed on a card. This set contains a special AID, written JCRE, for the Java Card Runtime Environment.


⁵ The qualified name of an entity is the complete name. For a class, it is p.c where p is the name of the package and c the (unqualified) name of the class. For a method (c.m) or a field (c.f) it is the qualified name of the class and the (unqualified) name of the method or field.

⁶ To extract an unqualified name, we use p for a package, c for a class or an interface, m for a method and f for a field. To extract a qualified name, we combine the symbols so, for example, $[\mathit{Id}]^{p.c}$ will extract a qualified name of a class (or interface) from the qualified name Id.


Classes and Interfaces A class or an interface descriptor consists of a set of access modifiers ($\mathcal{P}(\mathit{Mod}_{ci})$), the name of the class or interface ($\mathit{Id}_{ci}$), the name of the direct superclass or the names of the direct super-interfaces (Ext), the names of the interfaces that the class implements (Imp), the name of its package ($\mathit{Id}_p$), field declarations (Fld), and method declarations and implementations (Mtd). A class must have one superclass, the default being java.lang.Object, but an interface can have zero or more super-interfaces. Only a class can implement an interface, so for an interface this set is empty. The fields are described by a map from field names ($\mathit{Id}_f$) to a pair consisting of a set of access modifiers ($\mathcal{P}(\mathit{Mod}_f)$) and a type descriptor (Type). The type of a field is either a primitive type (boolean, short, byte, int) or the name of a class or an interface. All of this information is stored in the class hierarchy ($E_{ci}$).


Methods The methods are described by a map that associates a method descriptor ($\mathit{Desc}_m$) with a method signature (Sig). This structure consists of a set of access modifiers ($\mathcal{P}(\mathit{Mod}_m)$), the code of the method (Code), a description of the formal parameters (Param), optionally a description of the variable used to return a value (Res) and the local variables of the method (Varl). A signature is the name of the method ($\mathit{Id}_m$) and the list of type descriptors for its parameters ($\mathit{Type}^*$). Code is a list whose elements consist of a program counter value (Pc⁷) and the instruction at this address (Bytecode). The set of local variables is the list of all variable names ($\mathit{Id}_v$) with their type descriptor (Type).


Bytecode Due to space limitations, in this paper we only consider a subset of Java Card bytecode. The subset is nevertheless sufficient to illustrate the different features of our analysis; see [16] for a treatment of the full language. In the following, $T_i$ ranges over local variables and $S_i$ is used to give the list of the types of the parameters for a call (which can be found in the constant pool).


The main departure from standard bytecode is the introduction of the construct ifAID $T \in S$ BCinst. This specialized if-instruction takes as arguments a variable T that contains an AID and a set $S \in \mathcal{P}(\mathit{AID})$, and executes the instruction BCinst if the AID belongs to the set S. We have introduced this instruction to make explicit how the analysis takes information about AIDs into account. Ordinary bytecode can be transformed to use the ifAID instruction by identifying those conditional instructions that make tests of the form $\mathit{Aid} \in S$. Most such tests are syntactically explicit in Java Card source programs or can be identified by simple intra-procedural flow analysis.


Bytecode = ifAID $T \in S$ BCinst | BCinst


The Java Card bytecode is transformed into a “three-address”-like language. We will not describe this program transformation any further.

BCinst =
    T := getstatic f
  | $T_0$ := invokeinterface m $T_1$ $T_2 \cdots T_n$ $S_0 :: \cdots :: S_n :: S_{n+1}$
  | T := invokestatic getPrevCtx
  | $T_1$ := load $T_2$
  | putstatic f T
  | T := new $\mathit{Id}_{ci}$
  | $T_1$ := store $T_2$


T := getstatic f loads the value contained in the static field f of the class $[f]^{p.c}$ and stores it in T. $T_0$ := invokeinterface m $T_1$ $T_2 \cdots T_n$ $S_0 :: \cdots :: S_n :: S_{n+1}$ invokes the interface method m with the signature $S_0 :: \cdots :: S_{n+1}$ on the object contained in $T_1$, with parameters $T_2 \cdots T_n$, and the result is stored in the variable $T_0$ with type $S_{n+1}$. T := invokestatic getPrevCtx retrieves the AID of the last active context before the last context switch and stores it in T. $T_1$ := load $T_2$ loads the value contained in $T_2$ and stores it in $T_1$. T := new C stores a reference to the object created at this program point in T. putstatic f T loads the value contained in the variable T and stores it in the static field f of the class $[f]^{p.c}$. $T_1$ := store $T_2$ loads the value contained in $T_2$ and stores it in $T_1$.


⁷ We assume furthermore a set Pc of program counters. A program counter identifies an instruction within the whole class hierarchy and not just a method.


3.3 Auxiliary functions on the class hier- 
archy 


We define three predicates to determine if a class member (the second parameter) is visible from a given instruction (the first parameter). We have CI_Visibility? for a class or an interface, Method_Visibility? for a method and Field_Visibility? for a field. We must keep this test in the constraint because in some cases, like for the modifier protected, we need information about its dynamic values.


CI_Visibility?: $\mathit{Id}_m \times \mathit{Id}_{ci} \times E_{ci} \rightarrow \mathit{Boolean}$

Method_Visibility?: $\mathit{Id}_m \times \mathit{Id}_m \times \mathit{Desc}_m \times E_{ci} \rightarrow \mathit{Boolean}$

Field_Visibility?: $\mathit{Id}_m \times \mathit{Id}_f \times E_{ci} \rightarrow \mathit{Boolean}$


The function Lookup models the dynamic search of methods underlying virtual method calls. It takes as arguments the signature of a method, the class in which the method is declared, the class in which the invocation is made and the class hierarchy. It returns the set of fully qualified method names of the implementations of the method designated by the signature.


Lookup: $\mathit{Sig} \times \mathit{Id}_{ci} \times \mathit{Id}_{ci} \times E_{ci} \rightarrow \mathcal{P}(\mathit{Id}_m)$
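The flavour of such a lookup can be illustrated with a deliberately simplified Java sketch. It assumes single inheritance only, represents signatures as plain strings, and ignores both visibility and the class in which the invocation is made, so it is not the Lookup function of the analysis; the class `Hierarchy` and its methods are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical class hierarchy: superclass links plus declared signatures. */
final class Hierarchy {
    private final Map<String, String> superOf = new HashMap<>();
    private final Map<String, Set<String>> declared = new HashMap<>();

    /** Registers a class; superCls is null for a root class. */
    void addClass(String cls, String superCls, String... sigs) {
        superOf.put(cls, superCls);
        declared.put(cls, new HashSet<>(Arrays.asList(sigs)));
    }

    private boolean isSubclassOf(String cls, String ancestor) {
        for (String c = cls; c != null; c = superOf.get(c))
            if (c.equals(ancestor)) return true;
        return false;
    }

    /** Walks up from cls to the first class that declares sig. */
    private String resolve(String cls, String sig) {
        for (String c = cls; c != null; c = superOf.get(c))
            if (declared.getOrDefault(c, Collections.emptySet()).contains(sig))
                return c + "." + sig;
        return null;
    }

    /** All implementations that a call to sig, declared in declaringClass, may reach. */
    Set<String> lookup(String sig, String declaringClass) {
        Set<String> result = new TreeSet<>();
        for (String cls : superOf.keySet()) {
            if (isSubclassOf(cls, declaringClass)) {
                String impl = resolve(cls, sig);
                if (impl != null) result.add(impl);
            }
        }
        return result;
    }
}
```

With a class `Base` declaring `run`, a subclass `Keep` that inherits it and a subclass `Over` that overrides it, the lookup of `run` declared in `Base` yields the two implementations `Base.run` and `Over.run`.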


A full description of the Java visibility rules and method resolution would be quite lengthy due to the non-trivial semantics of these two language features. We refer instead to the literature [12, 15, 14].


4 Abstract domains


Owners and contexts An object is owned by an applet (or the JCRE), thus an owner is uniquely identified by an AID. Since an AID does not directly specify the package to which the applet belongs, we add this information for convenience. Thus, the set of object owners is defined by:


$\mathit{Owner} = \mathit{Id}_p \times \mathit{AID}$


We define an abstract context to be an abstraction of the call stack in which a method is executed (these contexts should not be confused with the Java Card notion of context). Our abstract contexts are designed to provide exactly the information that can be obtained by a call to the stack-inspecting method getPrevCxt (cf. Section 2). More precisely, the abstract context in which a method m is analyzed consists of a pair (Prev, App) where the first component Prev is the last active Java Card context before the last context switch and the second component App is the Java Card context of the caller (i.e., the active context that invoked m). Formally we define:


$\mathit{Context} = \mathit{Owner} \times \mathit{Owner}$


Values Weare primarily interested in modeling 
the object structure and ownership so we abstract 
primitive values such as booleans and integers to 
their type. To model the heap of objects, we adopt 
a common approach (going back to at least [13]) 
in which all objects created by the same new in- 
struction are identified by one object. We refine 
this by keeping the owner as part of the abstract 
object. More precisely, a reference (Ref) to an 
object (Obj) is abstracted into the instruction that 
created the object and the owner of the object. We 
suppose we have a special Null reference. 


$$\mathit{Ref} = (\mathit{Pc} \times \mathit{Owner}) \uplus \{\mathit{Null}\}$$


We have three kinds of abstract values: ref- 
erences, applet identifiers and primitive values 
which as mentioned above are abstracted by their 
type. 


$$\mathit{Value} = \mathit{Ref} \uplus \mathit{AID} \uplus \{\mathit{boolean}, \mathit{short}, \mathit{byte}, \mathit{int}\}$$


Concerning the concrete value in memory, we can have a class instance (Obj) which contains the name of the class ($\mathit{Id}_{ci}$), the owner of this instance (Owner), boolean flags indicating whether or not it is a JCRE entry point or a temporary JCRE entry point (cf. Section 2) and the set of fields (Fldv), a function which maps a field name to a set of values.


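$$\mathit{Obj} = \mathit{Id}_{ci} \times \mathit{Owner} \times \mathit{JCREep} \times \mathit{tJCREep} \times \mathit{Fldv}$$

$$\mathit{Fldv} = \mathit{Id}_f \rightarrow \mathcal{P}(\mathit{Value})$$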
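
The abstract domains defined so far can be written down as plain data types. The following Python sketch is only an illustration under stated assumptions: the class and field names (`Owner`, `AidValue`, `Prim`, `class_name`, and so on) are ours, AIDs and identifiers are modeled as strings, and `None` stands in for the Null reference.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Union

# Primitive values are abstracted to their type.
PRIMITIVE_TYPES = frozenset({"boolean", "short", "byte", "int"})


@dataclass(frozen=True)
class Owner:
    package: str  # Id_p: package of the owning applet
    aid: str      # AID of the owning applet (or of the JCRE)


@dataclass(frozen=True)
class Context:
    prev: Owner  # last active context before the last context switch
    app: Owner   # context of the caller


@dataclass(frozen=True)
class Ref:
    pc: int       # address of the `new` instruction that created the object
    owner: Owner  # owner of the created object


@dataclass(frozen=True)
class AidValue:
    aid: str      # wrapper (assumption) keeping AIDs apart from other strings


@dataclass(frozen=True)
class Prim:
    type_name: str

    def __post_init__(self):
        if self.type_name not in PRIMITIVE_TYPES:
            raise ValueError(f"not an abstract primitive: {self.type_name}")


# Value = Ref + AID + primitive type; None plays the role of Null.
Value = Union[Ref, AidValue, Prim, None]


@dataclass
class Obj:
    class_name: str                      # Id_ci
    owner: Owner
    jcre_entry_point: bool = False       # JCREep
    temp_jcre_entry_point: bool = False  # tJCREep
    fields: Dict[str, Set[Value]] = field(default_factory=dict)  # Fldv
```

Because `Ref` is a frozen (hashable) dataclass, two references built from the same `new` instruction and the same owner compare equal, which mirrors the identification of all objects created at one allocation site by one owner.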


Firewall checks The checks made by the firewall are formalized through a collection of pred-
icates. Covering all bytecode instructions would 
require eight different predicates ([16]); in this 
paper, we only use two of these predicates: 


• The predicate AccessInterface? validates the access to methods of an object.

$$\mathit{AccessInterface?} : \mathit{Ref} \times \mathit{Ref} \times \mathit{Id}_{ci} \times E_{ci} \rightarrow \mathit{boolean}$$

The first reference represents the current context, the second represents the object on which the call is made and $\mathit{Id}_{ci}$ is the name of the interface which declared the method called. The access is authorized if and only if the context represented by the first reference is the context of the JCRE or if the contexts of the two references are the same or if the second reference represents a JCRE entry point or if the class of the object represented by the second reference implements a shareable interface and $\mathit{Id}_{ci}$ extends a shareable interface.


• The predicate AccessPutstatic? checks the validity of the access to a static field of a class.

$$\mathit{AccessPutstatic?} : \mathit{Ref} \times \mathit{Value} \rightarrow \mathit{boolean}$$


The reference represents the current context 
that wants to store the value in the static 
field. The access is only authorized if the 
Java Card context represented by the refer- 
ence is the context of the JCRE or if the 
value is neither a global object nor a tem- 
porary JCRE entry point. 
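
The two predicates can be illustrated by a small executable sketch. It is not the formalization of [16]: the names `JCRE_PACKAGE`, `ClassEnv` and the `is_global` flag are assumptions introduced here, a Java Card context is approximated by the package of the owner, and the abstract memory is passed explicitly as a dictionary.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple

JCRE_PACKAGE = "jcre"  # hypothetical name for the JCRE context


@dataclass(frozen=True)
class Ref:
    pc: int
    owner: Tuple[str, str]  # (package, AID)


@dataclass
class Obj:
    class_name: str
    jcre_entry_point: bool = False
    temp_jcre_entry_point: bool = False
    is_global: bool = False  # assumption: not a component of Obj in the paper


@dataclass
class ClassEnv:
    shareable_classes: Set[str]     # classes implementing a shareable interface
    shareable_interfaces: Set[str]  # interfaces extending a shareable interface


def context_of(ref: Ref) -> str:
    """Approximate the Java Card context of a reference by its owner's package."""
    return ref.owner[0]


def access_interface(cur: Ref, target: Ref, iface: str,
                     env: ClassEnv, mem: Dict[Ref, Obj]) -> bool:
    """Sketch of AccessInterface?: the four alternatives listed in the text."""
    obj = mem[target]
    return (
        context_of(cur) == JCRE_PACKAGE
        or context_of(cur) == context_of(target)
        or obj.jcre_entry_point
        or (obj.class_name in env.shareable_classes
            and iface in env.shareable_interfaces)
    )


def access_putstatic(cur: Ref, value, mem: Dict[Ref, Obj]) -> bool:
    """Sketch of AccessPutstatic?: JCRE may store anything; applets may not
    store global objects or temporary JCRE entry points."""
    if context_of(cur) == JCRE_PACKAGE:
        return True
    obj = mem.get(value) if isinstance(value, Ref) else None
    return obj is None or not (obj.is_global or obj.temp_jcre_entry_point)
```

With an environment in which only the class of Alice implements a shareable interface, a call from Bob to Alice through that interface is accepted, while the reverse call is rejected.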


5 Flow analysis 


In this section we describe a data flow analysis to approximate the part of a program’s behaviour relevant to security verifications. The main information calculated by our analysis is an approximation of the objects stored in the variables of the program. More precisely, we calculate the following information:


• $V[\mathit{var}, m, \mathit{ctx}] \in \mathcal{P}(\mathit{Value})$: the set of values stored in the variable var of method m when this method is called in context ctx.


• $SF[\mathit{Id}_{ci}] : \mathit{Id}_f \rightarrow \mathcal{P}(\mathit{Value})$: the possible values of the static fields of a given class.


• $\mathit{mem} : \mathit{Ref} \rightarrow \mathit{Obj}$: an approximation of the memory in which an abstract reference of the form (pc, owner) is mapped to an abstract object that safely approximates all those concrete objects allocated by the instruction at address pc and owned by owner.


• $C[m, \mathit{ctx}] \in \mathcal{P}(\mathit{Ref})$: the set of objects on which a call to method m in context ctx is made.


It is important to analyze methods for each 
calling context since this is the information avail- 
able to the firewall at run-time. An analysis that 
does not exactly model this information would 
have poor precision. This information serves two 
purposes: it permits constructing a control flow 
graph (by resolving which method is called at a 
given virtual method call) and it makes explicit if 
an object owned by an applet is stored in a vari- 
able accessible by another applet. 


An intra-procedural analysis is required in or- 
der to approximate the behaviour of each server 
applet when it receives a request for a shared ob- 
ject. This analysis is orthogonal to the analysis 
presented in this paper and will not be described 
here. We shall assume the function: 


$$\mathit{Return\_SIO} : \mathit{AID} \times \mathit{AID} \rightarrow \mathcal{P}(\mathit{Ref})$$


It takes the AID of a server and the AID of a client and returns a safe approximation of the set of objects that the server accepts to share with the client (the set that it returns is equal to or larger than the set returned during the execution).


5.1 Quantified conditional constraints 


The analysis will be specified in constraint- 
based style. We introduce a new type of con- 
straints, the quantified conditional constraints 
(QCCs) that can be considered as a constraint 
scheme from which actual constraints can be gen- 
erated. 


The first kind of constraints used in static anal- 
ysis is the simple constraint (SC). It is used to 
model the flow and the modification of informa- 
tion. A simple constraint has the form: 


$$\mathit{Expression} \subseteq \mathit{Variable}$$


An extension of this kind of constraint was used by Palsberg and Schwartzbach [20] for type analysis. They take a simple constraint and add a condition under which the constraint is valid. Such a conditional constraint has the form:


$$\mathit{Class} \in \mathit{Variable}_1 \Rightarrow \mathit{Expression} \subseteq \mathit{Variable}_2$$


$\mathit{Variable}_2$ has Expression as a possible value if and only if Class is a possible value of $\mathit{Variable}_1$. The simple constraint models an instruction of a method and the condition models the fact that this method can effectively be called.


This kind of constraint solves the problem that
the constraints to be generated depend on the ac- 
tual data flow of the program. The solution has 
the drawback that it has to generate all possible 
constraints from the outset and then test for each 
iteration and for each constraint whether it should 
be taken into consideration. In the following, we 
propose to generate the constraints set in an incre- 
mental fashion where constraints are only added 
once the data flow analysis has actually estab- 
lished that the constraints will be activated. 


We propose to extend this kind of constraints 
in the following two ways: 


• allow more conditions, to model, for example, the activities of the environment like the firewall checks or the visibility rules,


• produce the system dynamically, based on the current value of each variable (instead of generating constraints for all possible values of the domain of the variable).


This new kind of constraints is called quantified 
conditional constraints and has the form: 


$$\forall v_1, \ldots, v_n \in S_1, \ldots, S_n : \mathit{cond}(v_1, \ldots, v_n) \Rightarrow \mathit{cstr}(v_1, \ldots, v_n)$$


Here, cstr is a set of simple constraints parameterized on $v_1, \ldots, v_n$ and cond are conditions on the values $v_1, \ldots, v_n$. Evaluation of such a QCC results in a set of constraints for each value $v_1, \ldots, v_n \in S_1, \ldots, S_n$ satisfying the condition cond. In our analysis, the QCCs have a particular structure, as shown below.


USENIX Association, CARDIS '02: 5th Smart Card Research & Advanced Application Conference


• The set S, used in the quantification, can be the set of possible values of a variable ($V[x, m, \mathit{ctx}]$), the set of objects on which a call is made ($C[m, \mathit{ctx}]$), the result of the Lookup or a constant set.


• The condition cond is a conjunction of conditions. It can be a test on the visibility, a firewall check or a test for membership of a constant set.


• A constraint cstr is a set of simple constraints SC. Each SC has the form $\mathit{Exp} \subseteq \mathit{Var}$. Exp can be a variable, a constant set, a dereferencing of the memory, the set of the values of a static field or a call to Return_SIO. Var can be a variable, a dereferencing of the memory or the set of the values of a static field.


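$$
\begin{array}{lcl}
\mathit{QCC} & : & \forall\, \mathit{value} \in S : \mathit{cond}(\mathit{value}) \Rightarrow \mathit{cstr}(\mathit{value}) \\
S & ::= & V[x, m, \mathit{ctx}] \mid C[m, \mathit{ctx}] \mid \mathit{ConstSet} \mid \mathit{Lookup}(\mathit{Sig}, \mathit{Id}_{ci}, \mathit{Id}_{ci}, E_{ci}) \\
\mathit{cond} & ::= & H_1 \wedge \cdots \wedge H_n \\
\text{Condition } (H) & ::= & \mathit{Cl\_Visibility?}(\mathit{Id}_{ci}, \mathit{Id}_{ci}) \mid \mathit{Method\_Visibility?}(\mathit{Id}_{ci}, \mathit{Id}_{ci}, \mathit{Desc}_m) \\
 & \mid & \mathit{Field\_Visibility?}(\mathit{Id}_{ci}, \mathit{Id}_f) \mid \mathit{AccessInterface?}(\mathit{Ref}, \mathit{Ref}, \mathit{Id}_{ci}) \\
 & \mid & \mathit{AccessPutstatic?}(\mathit{Ref}, \mathit{Value}) \mid \mathit{value} \in \mathit{ConstSet} \\
\mathit{cstr} & : & \mathcal{P}(SC) \\
\text{Constraint } (SC) & : & \mathit{Exp} \subseteq \mathit{Var} \\
\mathit{Exp} & ::= & \mathit{ConstSet} \mid V[x, m, \mathit{ctx}] \mid SF[\mathit{Id}_{ci}](\mathit{Id}_f) \mid C[m, \mathit{ctx}] \\
 & \mid & \mathit{mem}(\mathit{Ref}).\mathit{Fldv}(\mathit{Id}_f) \mid \mathit{Return\_SIO}(\mathit{AID}, \mathit{AID}) \\
\mathit{Var} & ::= & V[x, m, \mathit{ctx}] \mid SF[\mathit{Id}_{ci}](\mathit{Id}_f) \mid C[m, \mathit{ctx}] \mid \mathit{mem}(\mathit{Ref}).\mathit{Fldv}(\mathit{Id}_f)
\end{array}
$$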
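
To make the evaluation of a single QCC concrete, the following Python sketch instantiates one QCC under a given valuation: for every value of the quantification set that satisfies the condition, the parameterized simple constraints are produced. The encoding is an assumption made for illustration only: variables are named by strings, and a simple constraint $\mathit{Exp} \subseteq \mathit{Var}$ is a pair whose first component is either `("const", frozenset)` or `("var", name)`.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Iterable, Set, Tuple

Valuation = Dict[str, Set[str]]
# A simple constraint `exp ⊆ var`: exp is ("const", frozenset) or ("var", name).
SimpleConstraint = Tuple[Tuple[str, object], str]


@dataclass(frozen=True)
class QCC:
    # S: the quantification set, computed from the current valuation
    domain: Callable[[Valuation], Iterable[str]]
    # cond: the (conjunction of) conditions on the quantified value
    cond: Callable[[str], bool]
    # cstr: the simple constraints parameterized on the quantified value
    cstr: Callable[[str], FrozenSet[SimpleConstraint]]


def instantiate(qcc: QCC, val: Valuation) -> Set[SimpleConstraint]:
    """Return the simple constraints generated by `qcc` under `val`."""
    generated: Set[SimpleConstraint] = set()
    for value in qcc.domain(val):
        if qcc.cond(value):
            generated |= qcc.cstr(value)
    return generated
```

For instance, a putstatic-like QCC that quantifies over the values of a variable and rejects one of them only yields a constraint for the accepted value; nothing is generated as long as the quantification set is empty.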








5.2 Analysis 


The analysis generates, for each method and 
for an execution context ctx, a set of QCCs that 
describes the data flow of the method in this con- 
text. The set of constraints for a method is the 
union of the set of constraints for each instruc- 
tion. The function to analyze an instruction is: 


$$A_{inst} : \mathit{Inst} \times \mathit{Id}_m \times \mathit{Context} \rightarrow \mathcal{P}(\mathit{QCC})$$


This function takes three parameters: the in- 
struction to analyze, the current method, and the 
context in which the method is analyzed. An in- 
struction is just a program counter and the byte- 
code instruction at this address. In the following 
we define this function for each bytecode instruc- 
tion. 


getstatic The getstatic instruction loads a value stored in a static field of a class or interface and stores it into a local variable. The value in the field C.f is stored in the local variable T if and only if the field exists and the field is visible at instruction Inst (figure 2).


invokeinterface The invokeinterface instruction makes a call to an interface method. We calculate the set of methods to which the method signature sig can be resolved,

$$\mathit{Lookup}(\mathit{sig}, \mathit{mem}(o).\mathit{Type}, [p]^{ci}, E_{ci}),$$

together with the context (Prev, App) in which the called methods will be analyzed. If the call is accepted by the firewall, i.e. $\mathit{AccessInterface?}(r, o, [p]^{ci}, E_{ci})$ holds, we add constraints to simulate this call. We create constraints to simulate the transfer of the actual parameters to the formal parameters,

$$V[T_i, m, \mathit{ctx}] \subseteq V[P_i, q, (\mathit{Prev}, \mathit{App})],$$

and add a constraint to retrieve the value returned by the called method,

$$V[T_0, m, \mathit{ctx}] \supseteq V[R, q, (\mathit{Prev}, \mathit{App})].$$

Finally, we add the object o to $C[q, (\mathit{Prev}, \mathit{App})]$ to indicate that the method q was invoked on this object (figure 3).


load The load instruction loads the value contained in a variable and stores it in another variable. The values contained in the variable $T_2$ are transferred into the variable $T_1$ (figure 4).


new The new instruction simulates the creation of a new class instance and stores a reference to it into a variable. If the class is visible by the instruction, we store in $V[T, m, \mathit{ctx}]$ the reference to the created object (figure 5).


putstatic The putstatic instruction stores a value in a static field. The value contained in variable T is stored in the static field f of the class $[f]^{ci}$ if the field is visible by the instruction and if the firewall accepts this access (figure 6).


$$
\begin{array}{l}
A_{inst}((pc,\ T := \mathtt{getstatic}\ f),\ m,\ \mathit{ctx}) = \\
\quad \forall (r) \in C[m, \mathit{ctx}] : \mathit{Field\_Visibility?}(\mathit{mem}(r).\mathit{Id}_{ci}, f, E_{ci}) \\
\quad \Rightarrow \{\, V[T, m, \mathit{ctx}] \supseteq SF[[f]^{ci}]([f]^{f}) \,\}
\end{array}
$$

Figure 2: Getstatic
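$$
\begin{array}{l}
A_{inst}((pc,\ T_0 := \mathtt{invokeinterface}\ p\ T_1\ T_2 \cdots T_n\ S_0 : \cdots : S_n : S_{n+1}),\ m,\ \mathit{ctx}) = \\
\quad \forall (r, o, q) \in C[m, \mathit{ctx}] \times V[T_1, m, \mathit{ctx}] \times \mathit{Lookup}(\mathit{sig}, \mathit{mem}(o).\mathit{Type}, [p]^{ci}, E_{ci}) \\
\quad : \mathit{AccessInterface?}(r, o, [p]^{ci}, E_{ci}) \\
\quad \Rightarrow \left\{ \begin{array}{l}
V[T_1, m, \mathit{ctx}] \subseteq V[P_1, q, \mathit{ctx}'], \\
\quad \vdots \\
V[T_n, m, \mathit{ctx}] \subseteq V[P_n, q, \mathit{ctx}'], \\
\mathit{Init\_Var}(E_{ci}([q]^{ci}).\mathit{Mtd}(([q]^{m}, S_0 : \cdots : S_{n+1})).\mathit{Varl},\ q,\ \mathit{ctx}'), \\
C[q, \mathit{ctx}'] \supseteq \{o\}, \\
V[T_0, m, \mathit{ctx}] \supseteq V[R, q, \mathit{ctx}']
\end{array} \right\}
\end{array}
$$

where we have used the following abbreviations:

$$
\begin{array}{rcl}
\mathit{sig} & = & ([p]^{m}, S_2 : \cdots : S_n) \\
P_1 \cdots P_n & = & (E_{ci}([q]^{ci}).\mathit{Mtd})((q, S_2 : \cdots : S_n)).\mathit{Param} \\
\mathit{ctx}' & = & (\mathit{Prev}, \mathit{App}) \\
\mathit{App} & = & (\mathit{mem}(r).\mathit{Owner}.\mathit{Id}_p,\ \mathit{mem}(r).\mathit{Owner}.\mathit{AID}) \\
\mathit{Prev} & = & \left\{ \begin{array}{ll} \mathit{ctx}.\mathit{Prev} & \text{if } \mathit{ctx}.\mathit{App}.\mathit{Id}_p = \mathit{App}.\mathit{Id}_p \\ \mathit{ctx}.\mathit{App} & \text{otherwise} \end{array} \right. \\
R & = & (E_{ci}([q]^{ci}).\mathit{Mtd})((q, S_2 : \cdots : S_n)).\mathit{Res}.\mathit{Id}_v
\end{array}
$$

Figure 3: Invokeinterface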
$$A_{inst}((pc,\ T_1 := \mathtt{load}\ T_2),\ m,\ \mathit{ctx}) = \{\, V[T_1, m, \mathit{ctx}] \supseteq V[T_2, m, \mathit{ctx}] \,\}$$

Figure 4: Load
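$$
\begin{array}{l}
A_{inst}((pc,\ T := \mathtt{new}\ c),\ m,\ \mathit{ctx}) = \\
\quad \forall (r) \in C[m, \mathit{ctx}] : \mathit{Cl\_Visibility?}(\mathit{mem}(r).\mathit{Id}_{ci}, c, E_{ci}) \\
\quad \Rightarrow \{\, V[T, m, \mathit{ctx}] \supseteq \{(pc, r.\mathit{Owner})\} \,\}
\end{array}
$$

Figure 5: New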


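$$
\begin{array}{l}
A_{inst}((pc,\ \mathtt{putstatic}\ f\ T),\ m,\ \mathit{ctx}) = \\
\quad \forall (r, v) \in C[m, \mathit{ctx}] \times V[T, m, \mathit{ctx}] \\
\quad : \mathit{Field\_Visibility?}(\mathit{mem}(r).\mathit{Id}_{ci}, f, E_{ci}) \wedge \mathit{AccessPutstatic?}(r, v) \\
\quad \Rightarrow \{\, SF[[f]^{ci}]([f]^{f}) \supseteq \{v\} \,\}
\end{array}
$$

Figure 6: Putstatic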


$$A_{inst}((pc,\ T_1 := \mathtt{store}\ T_2),\ m,\ \mathit{ctx}) = \{\, V[T_1, m, \mathit{ctx}] \supseteq V[T_2, m, \mathit{ctx}] \,\}$$

Figure 7: Store


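$$
\begin{array}{l}
A_{inst}((pc,\ T := \mathtt{invokestatic}\ \mathit{getPrevCtx}),\ m,\ \mathit{ctx}) = \\
\quad \forall (r) \in C[m, \mathit{ctx}] : \mathit{ctx}.\mathit{App}.\mathit{Id}_p = \mathit{mem}(r).\mathit{Owner}.\mathit{Id}_p \\
\quad \Rightarrow \{\, V[T, m, \mathit{ctx}] \supseteq \{\mathit{ctx}.\mathit{Prev}.\mathit{AID}\} \,\} \\[1ex]
\quad \forall (r) \in C[m, \mathit{ctx}] : \mathit{ctx}.\mathit{App}.\mathit{Id}_p \neq \mathit{mem}(r).\mathit{Owner}.\mathit{Id}_p \\
\quad \Rightarrow \{\, V[T, m, \mathit{ctx}] \supseteq \{\mathit{ctx}.\mathit{App}.\mathit{AID}\} \,\}
\end{array}
$$

Figure 8: getPrevCtx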


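Let $A_{inst}((pc, \mathit{BCinst}), m, \mathit{ctx}) = \forall \delta \in E : \mathit{cond} \Rightarrow \{C\}$. Then

$$
\begin{array}{l}
A_{inst}((pc,\ \mathtt{ifAID}\ T \in S\ \mathit{BCinst}),\ m,\ \mathit{ctx}) = \\
\quad \forall (\delta, a) \in E \times V[T, m, \mathit{ctx}] : \mathit{cond} \wedge a \in S \Rightarrow \{C\}
\end{array}
$$

Figure 9: ifAID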


Figure 10: Examples of QCCs


store The store instruction stores the value contained in variable $T_2$ in variable $T_1$. This data flow is modeled by a simple set inclusion: values contained in variable $T_2$ may also be contained in variable $T_1$ (figure 7).


getPrevCtx The instruction invokestatic getPrevCtx makes a call to the static method JCSystem.getPreviousContextAID. The method getPrevCtx serves to find the AID of the active applet before the last context switch. The first constraint is activated when the active context is the context of the caller, in which case they have the same previous context. The second one is activated when the active context differs from the context of the caller. In that case the previous context is the context of the caller (figure 8).


ifAID The QCC used in this construct is the 
one analyzed for the BCinst instruction. A con- 
dition is added such that the constraints are only 
generated if the condition in the test is true (fig- 
ure 9). 


6 Resolution 


The resolution of quantified conditional constraints can be done iteratively as an ordinary fixpoint computation. The main difference with a “classic” system is that the set of constraints and the values of variables in the constraints evolve together. Hence, the iteration sequence consists of triples (qcc, sc, val) where qcc is the current set of quantified conditional constraints instantiated for particular contexts, sc is the current set of simple constraints and val is a valuation that associates each variable with its current value.


Suppose that we have a program P consisting of a set of applets (Aplt) and a set of methods (Meth). Let Q be the set of (uninstantiated) QCCs obtained by analyzing P (with functions $A_{class}$ for a class or an interface, $A_{meth}$ for a method and $A_{inst}$ for an instruction). During the resolution of Q, we compute the new set of instantiated QCCs, $\mathcal{P}(\mathit{QCC})$, with the function $\mathit{Eval}_{qcc}$, the new set of simple constraints, $\mathcal{P}(SC)$, with the function $\mathit{Eval}_{sc}$ and the new valuation Val with the function $\mathit{Eval}_{val}$, as defined below.







The function $\mathit{Eval}_{qcc}$ uses the current valuation to instantiate the QCCs in the set Q and
adds the corresponding constraints to the current 
set of constraints. This is where the resolution 
becomes context-sensitive: if a method is not 
called in a particular context, no constraints for 
this method will be generated in that particular 
context. 

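$$
\begin{array}{l}
\mathit{Eval}_{qcc} : \mathcal{P}(\mathit{QCC}) \times \mathit{Val} \rightarrow \mathcal{P}(\mathit{QCC}) \\
\mathit{Eval}_{qcc}(\mathit{qcc}, \mathit{val}) = \mathit{qcc} \ \cup \left\{ \mathit{ctr} \ \middle|\ 
\begin{array}{l}
m \in \mathit{Meth} \wedge \mathit{ctx} \in \mathit{Context} \wedge o \in C[m, \mathit{ctx}] \\
\wedge\ \mathit{ctx}' = \mathit{CalcCtx}(o, \mathit{ctx}, \mathit{val}) \wedge \mathit{ctr} \in A_{meth}(m, \mathit{ctx}')
\end{array} \right\}
\end{array}
$$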


where the function for calculating the context of the call is given by


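$$
\begin{array}{l}
\mathit{CalcCtx} : \mathit{Ref} \times \mathit{Context} \times \mathit{Val} \rightarrow \mathit{Context} \\
\mathit{CalcCtx}(r, c, v) = (\mathit{Prev}, \mathit{App}) \quad \text{where} \\
\quad \mathit{App} = (v(\mathit{mem}))(r).\mathit{Owner} \\
\quad \mathit{Prev} = \left\{ \begin{array}{ll} c.\mathit{Prev} & \text{if } c.\mathit{App}.\mathit{Id}_p = \mathit{App}.\mathit{Id}_p \\ c.\mathit{App} & \text{otherwise} \end{array} \right.
\end{array}
$$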
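
The context computation is small enough to be stated directly as code. In the sketch below, the valuation's abstract memory is reduced to a dictionary `mem_owner` from references to owners; this simplification and all identifier names are assumptions made for the illustration.

```python
from typing import Dict, NamedTuple


class Owner(NamedTuple):
    package: str  # Id_p
    aid: str      # AID


class Context(NamedTuple):
    prev: Owner
    app: Owner


class Ref(NamedTuple):
    pc: int
    owner: Owner


def calc_ctx(r: Ref, c: Context, mem_owner: Dict[Ref, Owner]) -> Context:
    """Context of a call on object `r` issued from context `c`."""
    app = mem_owner[r]  # owner recorded for r in the abstract memory
    # No context switch if both owners live in the same package:
    # the previous context is inherited. Otherwise the caller's
    # context becomes the previous context.
    prev = c.prev if c.app.package == app.package else c.app
    return Context(prev=prev, app=app)
```

Chaining two calls across packages (Charlie to Bob, then Bob to Alice) shows how the component Prev follows the last context switch, whereas a call that stays inside one package leaves it unchanged.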


The function $\mathit{Eval}_{sc}$ uses the current valuation to verify the condition for each constraint in the set of instantiated QCCs and adds the corresponding simple constraints to the current set of constraints. This evaluation restricts the production of simple constraints to those that model the effect of an instruction that is “executed”. We use the notation $[\![\mathit{Exp}]\!]_V$ to denote the evaluation of the expression Exp with the values contained in the valuation V.


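$$
\begin{array}{l}
\mathit{Eval}_{sc} : \mathcal{P}(\mathit{QCC}) \times \mathcal{P}(SC) \times \mathit{Val} \rightarrow \mathcal{P}(SC) \\
\mathit{Eval}_{sc}(\mathit{qcc}, \mathit{sc}, \mathit{val}) = \mathit{sc} \ \cup \left\{ \mathit{ctr}[x/E] \ \middle|\ 
\begin{array}{l}
(\forall x \in X : \mathit{cond} \Rightarrow \mathit{ctr}) \in \mathit{qcc} \\
\wedge\ E \in [\![X]\!]_{\mathit{val}} \wedge \mathit{cond}[x/E]
\end{array} \right\}
\end{array}
$$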
The function $\mathit{Eval}_{val}$ is the standard evaluation function associated to a constraint set. For every constraint $\mathit{exp} \subseteq \mathit{var}$ in the current constraint set sc we evaluate the expression with the current valuation and add the new value to val(var).


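$$
\begin{array}{l}
\mathit{Eval}_{val} : \mathcal{P}(SC) \times \mathit{Val} \rightarrow \mathit{Val} \\
\mathit{Eval}_{val}(\mathit{sc}, \mathit{val}) = \mathit{val}[\mathit{var} \mapsto \mathit{val}(\mathit{var}) \cup [\![\mathit{exp}]\!]_{\mathit{val}}] \quad \text{with } \mathit{exp} \subseteq \mathit{var} \in \mathit{sc}
\end{array}
$$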


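ALGORITHM

    $Q := \bigcup_{A \in \mathit{Aplt}} A_{class}(A)$ ;
    $\mathit{qcc}' := A_{class}(\mathit{JCRE})(\mathit{JCRE}, \mathit{JCRE})$ ;
    $\mathit{sc}, \mathit{sc}', \mathit{qcc} := \emptyset$ ;
    $\mathit{val} := \mathit{val}_\bot$ ;
    $\mathit{val}' := \mathit{val}_0$ ; (see footnote 8)
    while $\mathit{qcc} \neq \mathit{qcc}'$ or $\mathit{sc} \neq \mathit{sc}'$ or $\mathit{val} \neq \mathit{val}'$ do
        $\mathit{qcc} := \mathit{qcc}'$ ; $\mathit{sc} := \mathit{sc}'$ ; $\mathit{val} := \mathit{val}'$ ;
        $\mathit{qcc}' := \mathit{Eval}_{qcc}(\mathit{qcc}, \mathit{val})$ ;
        $\mathit{sc}' := \mathit{Eval}_{sc}(\mathit{qcc}, \mathit{sc}, \mathit{val})$ ;
        $\mathit{val}' := \mathit{Eval}_{val}(\mathit{sc}, \mathit{val})$ ;
    endwhile

END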
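
The shape of this iteration can be illustrated with a reduced Python sketch. It is not the algorithm above: the context instantiation step ($\mathit{Eval}_{qcc}$) is left out, a QCC is simplified to a triple (name of the quantification variable, condition, constraint generator), and all names are assumptions. What the sketch does keep is the demand-driven character: simple constraints are only created once the valuation contains a value that activates them.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

Exp = Tuple[str, object]            # ("const", frozenset) or ("var", name)
SimpleConstraint = Tuple[Exp, str]  # exp ⊆ var
# (quantification variable, condition, constraint generator)
QCC = Tuple[str, Callable[[str], bool],
            Callable[[str], Iterable[SimpleConstraint]]]


def evaluate(exp: Exp, val: Dict[str, Set[str]]) -> Set[str]:
    """Value of an expression under the current valuation."""
    kind, payload = exp
    return set(payload) if kind == "const" else set(val.get(payload, ()))


def solve(qccs: List[QCC], initial: Dict[str, Set[str]]):
    """Iterate constraint generation and propagation until nothing changes."""
    val = {name: set(values) for name, values in initial.items()}
    sc: Set[SimpleConstraint] = set()
    changed = True
    while changed:
        changed = False
        # Counterpart of Eval_sc: generate constraints whose condition holds.
        for domain, cond, generate in qccs:
            for value in sorted(val.get(domain, ())):
                if cond(value):
                    for constraint in generate(value):
                        if constraint not in sc:
                            sc.add(constraint)
                            changed = True
        # Counterpart of Eval_val: propagate values along exp ⊆ var.
        for exp, var in sc:
            new_values = evaluate(exp, val)
            target = val.setdefault(var, set())
            if not new_values <= target:
                target |= new_values
                changed = True
    return val, sc
```

Since both the constraint set and the valuation only grow, and both are drawn from finite sets in this sketch, the loop terminates; a QCC whose quantification variable never receives a value contributes no simple constraint at all.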


Proposition 6.1 This algorithm terminates with a correct solution to Q.


The proof of Proposition 6.1 is an extension 
of the standard argument based on Tarski’s the- 
orem [24, 11]. The specificity of the proof is 
to take into account that the system evolves (in 
a monotonic fashion!) during the computation. 
The formal proof (termination and correctness) 
can be found in [16]. 


Establishing a start state for the iteration requires special attention in Java Card because there is no main to initialize the analysis. The sequence of operations is given by the JCRE and the user. We model this interaction with the card by adding an artificial JCRE applet that is analyzed like the others. For the JCRE we know its context (it is (JCRE, JCRE)), which permits the algorithm to produce the initial set of instantiated QCCs. The initial valuation $\mathit{val}_0$ links each element with its default value. For each $V[x, m, \mathit{ctx}]$ and $C[m, \mathit{ctx}]$ the default value is $\emptyset$. For each $SF[\mathit{Id}_{ci}]$ the default value is the function which links each static field of $\mathit{Id}_{ci}$ with its default value ($\emptyset$ for a reference and {P} for a primitive P). Finally, we initialize the abstract memory (mem) with the undefined abstract objects for each abstract reference.


8 The definition of the initial valuation $\mathit{val}_0$ comes after the algorithm.


7 An example analysis


In figure 11, we present a variation of the example given in section 2.1, in which the firewall and Alice cannot prevent the flow of Alice's secret to Charlie. Here, Bob implements a shareable object and passes a reference to it to Charlie. In this case, the invoke at Alice.foo is valid at run-time, because for Alice the caller is always Bob. Here, we only present the translation of this example into our language (figure 12). The constraints are neither generated nor solved automatically yet, but we are working on an implementation of the previously presented algorithm. During the resolution, each “variable” receives the possible values that it can contain. In this example, the important value is the secret of Alice (represented by the reference (p, AliceAID)) and the important variable is the static field AliceSecret of Charlie. The resolution gives, as a part of the global solution, the following possible value for the static field of Charlie:


$$(p, \mathit{AliceAID}) \in SF[\mathit{Charlie}](\mathit{Charlie.AliceSecret})$$


This result proves that there is an illegal object 
flow with the secret of Alice. 
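
The flow can be replayed by hand on a simplified version of the example. The sketch below is an assumption-laden illustration, not output of the analysis: the constraints for instructions 1 to 11 of figure 12 are written manually, contexts are already resolved, visibility and firewall checks other than the ifAID test are taken to succeed, and variable names are plain strings. The parameter `prev_aid` stands for the AID that getPrevCtx yields inside Alice.foo.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple, Union

SECRET = ("p", "AliceAID")  # abstract reference to Alice's secret
Valuation = Dict[str, Set[object]]
Guard = Optional[Callable[[Valuation], bool]]
# (guard, source, destination): once the guard holds, source ⊆ destination
Constraint = Tuple[Guard, Union[frozenset, str], str]


def example_constraints(prev_aid: str) -> List[Constraint]:
    return [
        (None, frozenset({prev_aid}), "T1"),          # 1: getPrevCtx
        (None, "T1", "Client"),                       # 2: store
        (lambda val: "BobAID" in val.get("Client", set()),
         "SF[Alice](ObjectSecret)", "T2"),            # 3: ifAID ... getstatic
        (None, "T2", "Response"),                     # 4: store
        (None, "Response", "Alice.fooRet"),           # 5: load
        (None, "Alice.fooRet", "T4"),                 # 7: result of MSI.foo
        (None, "T4", "Bob.foo2Ret"),                  # 8: store
        (None, "Bob.foo2Ret", "T6"),                  # 10: result of MSI2.foo2
        (None, "T6", "SF[Charlie](AliceSecret)"),     # 11: putstatic
    ]


def propagate(constraints: List[Constraint], initial: Valuation) -> Valuation:
    """Least valuation satisfying all constraints whose guard holds."""
    val = {name: set(values) for name, values in initial.items()}
    changed = True
    while changed:
        changed = False
        for guard, src, dst in constraints:
            if guard is not None and not guard(val):
                continue
            new_values = src if isinstance(src, frozenset) else val.get(src, set())
            target = val.setdefault(dst, set())
            if not new_values <= target:
                target |= new_values
                changed = True
    return val
```

With `prev_aid = "BobAID"`, as in figure 12 where Alice is always called by Bob, the secret reaches the static field of Charlie. Had getPrevCtx returned Charlie's AID (a hypothetical direct call), the guard of instruction 3 would never be activated and the field would stay empty.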


8 Related works 


The formalization of the Java Card firewall has 
been the object of several works. Motré [19] has 
formalized the firewall with the B method. She 
defines a machine for the firewall and an opera- 
tion for each check of the firewall. This modeling 
provides a formal description of the firewall that 
is used to ensure that the firewall verifications are 
sufficient to fulfill the security policy. In addition, 
successive refinements lead to a reference imple- 
mentation of the firewall. More traditional opera- 
tional semantics for modeling the firewall checks 
have been given by Eluard et al. [17]. Siveroni et 
al. [22] show how to integrate this into an opera- 
tional semantics for Java Card. For the modeling 
of the JCRE itis necessary to be able to “execute” 
the differents applets. We choose to follow the 
approach used by Attali et al. [3, 4] and model 
the JCRE by an applet. With this approach, we 
can adapt the JCRE to obtain either exactly the 
execution we want or all possible executions. 


    public class Bob extends Applet
                     implements MSI2 {
      private static MSI AliceObj;
      private void bar () {
        AliceObj = (MSI) getSIO (AliceAID); }
      public Secret foo2 () {
        return AliceObj.foo (); } }


    public class Charlie extends Applet {
      private static MSI2 BobObj;
      private static Secret AliceSecret;
      private void bar () {
        BobObj = (MSI2) getSIO (BobAID); }
      private void foo3 () {
        AliceSecret = BobObj.foo2 (); } }


Figure 11: An example of illegal object flow


    public class Alice extends Applet implements MSI {
      private Secret ObjectSecret;
      public Secret foo () {
        AID Client;
        Secret Response;
        1: T1 := invokestatic getPrevCtx
        2: Client := store T1
        3: ifAID Client ∈ {BobAID} T2 := getstatic Alice.ObjectSecret
        4: Response := store T2
        5: Alice.fooRet := load Response
        return Alice.fooRet } }

    public class Bob extends Applet
                     implements MSI2 {
      private static MSI AliceObj;
      public Secret foo2 () {
        6: T3 := getstatic Bob.AliceObj
        7: T4 := invokeinterface MSI.foo T3
        8: Bob.foo2Ret := store T4
        return Bob.foo2Ret } }

    public class Charlie extends Applet {
      private static MSI2 BobObj;
      private static Secret AliceSecret;
      private void foo3 () {
        9: T5 := getstatic Charlie.BobObj
        10: T6 := invokeinterface MSI2.foo2 T5
        11: putstatic Charlie.AliceSecret T6 } }


Figure 12: The translation of the three methods of the example in our language


The problems related to the Java Card firewall have been observed by others, notably Montgomery and Krishna [18], who propose another
approach to secure object sharing based on dele- 
gates. A server implements a delegate object that 
mediates access to those methods that the server 
wants to share with others. The delegate object 
performs the checks that it deems necessary to 
grant access. This approach is more flexible than 
the existing firewall but has the drawback that it 
requires (minor) changes to the JCVM. This technique permits the use of more sophisticated authentication mechanisms than the one based only on
AID comparison. In the paper it is shown how 
to use a protocol based on challenge/response 
phrases to avoid the problem of AID spoofing. 
However, no technique is presented for proving 
that delegates indeed do respect a given security 
policy. In contrast, our approach works for the 
standard JCVM and relies on static analysis to 
check that no unwanted access takes place. 


Two works on the verification of applet sharing 
on Java Card are closely related to ours. Bieber et 
al. [8, 7], as part of the Pacap project [2], have de- 
fined an analysis of Java Card applets which can 
detect illegal information flow. Their approach is 
based on three elements: an abstraction of values 
of variables into a level that describes the sharing 
of the value, an invariant that is a sufficient condition for the security property to hold and a model checker to verify the invariant. A lattice of levels is used to represent the sharing of objects. If an applet A is allowed to share some information with an applet B, the level A+B is entered into the lattice specifying the security policy. Each applet is represented by a call graph and each call graph
is transformed into an SMV model. To work with 
a shareable object, an applet must call an inter- 
face method so only call graphs which include an 
interface method are taken into account. The in- 
variant together with the control flow graphs are 
given to the SMV model checker for verification. 
The work presented here complements their work 
by providing a precise description of how these 
control and data flow graphs can be calculated, 
taking into account the firewall and the different 
calling contexts. 


The analysis proposed by Caromel, Henrio and Serpette [9] aims to signal whether a security exception might (or will definitely) be raised by the firewall during the execution of a set of applets. The analysis thus shares objectives with ours and calculates the same type of information. The differences between the analyses lie in the precision. Caromel et al. have opted for a simple, flow-
insensitive analysis whereas we can obtain some 
flow sensitivity through the choice of local vari- 
ables in our three-address byte code. Instead of 
modeling the memory state explicitly, they use 
an alias analysis to track side effects of assign- 
ments. The control flow analysis in their analysis 
is a simple class hierarchy analysis, in contrast to 
our context-sensitive flow analysis. Indeed, their 
analysis does not analyze methods separately for 
each calling context and hence would not be able 
to deal with the call stack inspection as well as 
our analysis. Thus, the two analyses can be seen as two extremes of the design space for flow analysis for Java Card.


The quantified conditional constraints (QCCs) introduced in Section 5.1 are an extension of the conditional constraints (originally due to Reynolds [21]) that are used in the object-oriented type analysis defined by Palsberg and Schwartzbach [20]. In this analysis, conditions of the form $C \in V(X)$ are used to guard the constraints generated from class C such that these are only evaluated when class C is actually used.
However, it is still necessary to generate the con- 
straints for every class in the hierarchy which 
leads to scalability problems. The QCCs, on 
the other hand, generate these constraints on de- 
mand: only when the analysis discovers that a 
certain class or method is used, the corresponding 
constraints are generated and added to the current 
set of constraints. 


9 Conclusions and future work 


The access control exercised by the Java Card 
firewall is bypassed when invoking methods on 
shareable objects. In order to determine the ac- 
cess control that is implemented by a given set of 
Java Card applets we have presented a static anal- 
ysis that calculates a safe approximation of the 
flow of objects between applets of a Java Card 
application. The static analysis is an extension of the constraint-based program analysis framework that allows data flow constraints to be generated and solved in a demand-driven fashion.


The information calculated by our analysis has 
other applications than verifying access control. 
The data flow information makes it possible to construct a precise control flow graph on which other safety-style properties of the application can be verified. Examples of these include verifying that all Java Card transactions are well-formed and that exceptions are properly caught and treated by the
application. A verification technique based on 
model checking using finite automata is detailed 
in [16]. 


The present analysis does not deal with the 
problem of (indirect) information flow between 
applets. In particular, we do not model the flow 
of primitive values between applets so we can- 
not detect if applet B transfers data to applet C 
that contains information obtained from applet A. Analyses for detecting such information flow have been proposed elsewhere (see e.g. [25]) in the setting of a simple imperative language. The control and object flow information calculated by our analysis can be used to adapt such analyses to the Java Card language because it makes it possible to eliminate the higher-order and object-oriented features of an application, essentially translating it into an imperative language. This requires an improvement to the abstract domains such that owner information can be attached to primitive values; primitive operations must also be adjusted to calculate the possible owners depending on the values used in the operation as well as the applet which performs the operation.


Finally, for the moment the analysis does not 
take into account exceptions other than security 
exceptions. With the current abstraction of the 
primitive values it is clear that exceptions related 
to e.g., array access (index-out-of-bound excep- 
tions) can only be dealt with in a very approxi- 
mate fashion. Exceptions form an integral part of 
the control-flow of an application so progress in
this direction is desirable.


Abstract


This paper addresses mobile code protection with respect to potential integrity and confidentiality violations originat- 
ing from the untrusted runtime environment where the code execution takes place. Both security properties are 
defined in a framework where code is modeled using Boolean circuits. Two protection schemes are presented. The 
first scheme addresses the protection of a function that is evaluated by an untrusted environment and yields an 
encrypted result only meaningful for the party providing the function. The second scheme addresses the protection of 
a piece of software executed by an untrusted environment. It enforces the secure execution of a series of functions 
while allowing interactions with the untrusted party. The latter technique relies on trusted tamper-proof hardware 
with limited capability. Executing a small part of the computations in the tamper-proof hardware extends its intrinsic security to the overall environment.


1 Introduction 


The mobile code paradigm is becoming increasingly 
praised for its flexibility in the management of remote 
computers and programmable devices. Unsurprisingly, 
more flexibility leads to new challenging security prob- 
lems. Mobile code presents vulnerabilities unheard of in 
the traditional programming world. On the one hand,
attacks may be performed by mobile programs against a 
remote execution environment and its resources. On the 
other hand, a mobile code may be subverted by a mali- 
cious remote execution environment. The former issue 
has been widely addressed [25], for instance through 
containment mechanisms like the sandbox, the applet- 
firewall, etc., but few solutions deal with the latter. 


This paper extends our work on the protection of 
mobile code [23], [24]. The problem addressed here is 
as follows: Alice (A) wants a piece of code to be executed on Bob’s (B) workstation, the result of its execution being eventually returned to A. However, B cannot be trusted and might try to modify the execution of this
program. In addition, in the context of mobile code, 
interacting with A during the program execution is not 
an option. In other words, the code sent by A must be 
executed autonomously by B who only provides addi- 
tional input parameters. It is necessary to ensure that B 
cannot get information about the semantics of the code 
provided by A and that A can be assured, without per- 
forming the computation herself, that the execution has 
not been tampered with. This is different from volunteer 
computing scenarios [34] such as Seti@home where 
data to treat is provided by A. 


In this model, “integrity of execution” means that B 
cannot alter the execution of the program and surreptitiously modify its results. “Confidentiality of execution”, sometimes termed “privacy of computation”
although it bears no relationship with anonymity, aims 
at preventing the disclosure of program semantics. 
Integrity and confidentiality are tightly entangled in our 
proposal because of the cryptographic protection 
scheme used. However, confidentiality and integrity are independent properties and thus will be evaluated separately in each solution.


The use of tamper-proof hardware (TPH) has long 
been the only solution for protecting the execution of 
critical programs from an untrusted party: the program 
is completely executed within the hardware that is 
trusted by the code owner. With the advent of mobile 
code, TPH has logically been advocated as the most 
obvious solution for protecting a program from its 
untrusted execution environment. [43] is a good exam- 
ple of this hardware-only trend. However, existing solu- 
tions based on tamper-proof hardware suffer from 
inherent limitations ranging from the cost and difficulty 
of retrofitting tamper-proof and powerful cryptographic 
boards on everybody’s workstations to the lack of com- 
puting power in smart cards. 


Prompted by the limitation of TPH-based solutions, 
alternative approaches were brought up as application- 
specific solutions [10], solutions aimed at protecting 
specific classes of mathematical functions [33], [32], or 
even empirical and mathematically unfounded ones like 
obfuscation [16]. Our proposal also takes into account 
the inherent limitations of tamper-proof hardware. 
Mathematical functions, which can be represented by 
Boolean circuits, are a building block for programs. Sec- 
tion 2 presents a scheme ensuring a secure non-interac- 
tive evaluation of such functions. To this end, the circuit 
implementing the function is encrypted using a tech- 
nique inspired by the McEliece public key scheme [28]. 
Based on this solution, Section 3 describes a scheme for 
the secure non-interactive evaluation of a piece of software
consisting of the combination of several functions
and of a control structure scheduling these functions. In 
this case, a Tamper-Proof Hardware acting on behalf of 
the party providing the mobile code is required. Execut- 
ing a small part of the computations in the TPH extends 
the intrinsic security of the TPH to the overall environ- 
ment. In Section 4, this solution is compared with simi- 
lar approaches dealing with integrity or confidentiality 
of execution. 


2 Protecting Functions 


As a first step towards integrity and confidentiality of 
execution, a solution for protecting mathematical func- 
tions is proposed. This solution is inspired by the work 
of Sander and Tschudin [33], [32], who devised a func- 
tion hiding scheme for non-interactive protocols (see 
Section 4.1). 


Figure 1 describes the main steps of the function protection
process using this solution. Using $E_A$, function $f$
is encrypted by its originator A into a new function $f'$.
The untrusted host B evaluates $f'$ on the cleartext input $x$
and gets $y'$ as the encrypted result of this evaluation.
Using the secret decryption algorithm $D_A$, A can retrieve
$y$, the cleartext result of the original function $f$,
based on the following property of the function hiding
scheme: $y = D_A(y') = D_A(f'(x)) = D_A(E_A(f(x))) = f(x)$.
Moreover, an integrity verification algorithm $V_A$ is used
by A to check the computation performed by B.



Figure 1: Evaluating an encrypted function on an untrusted host.
1) the function is encrypted; 2) the encrypted function is sent to the untrusted host (confidentiality); 3) the encrypted function
is evaluated with cleartext inputs; 4) the encrypted result is sent back to A; 5) the result is verified (integrity); 6) decipherment
is performed to obtain the cleartext result.
Figure 2: Construction of the encrypted circuit $c'$.
The original circuit $c$ maps $X \to Y$, i.e. $\{0,1\}^l \to \{0,1\}^k$; the encrypted circuit $c'$ maps $X \to Y'$, i.e. $\{0,1\}^l \to \{0,1\}^n$.


2.1 Computational Model 


This section defines the mechanisms required to ensure 
integrity and confidentiality of execution in an untrusted 
environment. Section 2.2 describes more precisely how 
those concepts are implemented in our approach. 


General Overview 


Since fixed-length inputs and outputs are used, it is possible
to deal with functions using a Boolean circuit representation.
Let us represent the function $f$ with a circuit called $c$ (in
the remainder of this paper, circuit and function are largely
used as equivalents). The number of binary inputs ($l$) and
outputs ($k$) is defined according to the possible input and
output values of the function. $X$ is the unrestricted set of
all possible inputs (e.g. $\{0,1\}^l$). $F_{l,k}$ represents the
family of Boolean circuits with $l$ inputs and $k$ outputs
($X \to Y$). A circuit $c \in F_{l,k}$ defines a relation between
input $x \in X$ and output $y \in Y$, where
$Y = \{c(x) \mid x \in X\}$ and $Y \subseteq \{0,1\}^k$. The circuit
$c$ may also be seen as a set of $k$ functions
$\{0,1\}^l \to \{0,1\}$. Each of these functions is defined by
a Boolean equation. The corresponding $k$ equations are
the inputs to algorithm $E_A$ (Figure 2). The result is a set
of $n$ Boolean equations that define a new Boolean circuit
$c' \in F_{l,n}$. The circuit $c'$ defined by A is evaluated by B,
who provides input data $x \in X$, but the encryption by
$E_A$ prevents the disclosure of $c$ to B. See Section 2.2
for details on the encryption mechanism $E_A$.
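To make the circuit model concrete, the following minimal sketch tabulates a function over $X = \{0,1\}^l$ and splits it into its $k$ one-bit output functions. The 2-bit adder and all helper names are hypothetical choices made for illustration; they do not come from the paper.

```python
from itertools import product

def truth_table(f, l):
    """Tabulate f over X = {0,1}^l; each value is a tuple of k output bits."""
    return {x: tuple(f(x)) for x in product((0, 1), repeat=l)}

def output_functions(table, k):
    """Split the circuit into its k one-bit functions {0,1}^l -> {0,1}."""
    return [{x: y[i] for x, y in table.items()} for i in range(k)]

# Hypothetical example circuit: a 2-bit adder, so l = 4 inputs and k = 3 outputs.
def add2(x):
    a = 2 * x[0] + x[1]
    b = 2 * x[2] + x[3]
    s = a + b
    return ((s >> 2) & 1, (s >> 1) & 1, s & 1)

table = truth_table(add2, 4)            # the relation between x and y = c(x)
functions = output_functions(table, 3)  # the k Boolean equations fed to E_A
Y = set(table.values())                 # Y = {c(x) | x in X}, a subset of {0,1}^3
```

Here `Y` is strictly smaller than $\{0,1\}^3$, which illustrates why the set of valid outputs is written as a subset of $\{0,1\}^k$.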


Confidentiality of Execution 


The circuit $c' = E_A(c)$ preserves the confidentiality of $c$ if
it is computationally infeasible to derive $c$ from $c'$. A
decryption algorithm $D_A$ must be used in order to
retrieve the desired cleartext result $y = c(x)$ from the
obtained ciphertext result $y' = c'(x)$. A polynomial time
decryption algorithm is necessary to remain realistic.


Integrity of Execution 


Alice receives a ciphertext result $y'$ corresponding to the
evaluation of the encrypted circuit $c'$ with Bob’s cleartext
inputs. Alice should be able to retrieve from $y'$ the
cleartext result $y$ corresponding to the evaluation of the
circuit $c$ using the same input. $V_A$ has to define a
polynomial time verification of this result since, for practical
applications, the circuit owner must be able to efficiently
verify the result of the circuit execution. The verifier
concept is introduced to address the problem of integrity
of execution. The verifier shares some similarities with
CS Proofs [29] in that there can exist invalid proofs but
those should be hard to find. Basically, the verifier concept
relies on the difficulty of finding valid values ($y'$)
that do not correspond to valid cleartext outputs ($y$).
Using the terminology of [9], Alice’s verifier $V_A$ checks
that there exists an $x$ such that $y = c(x)$. It can be defined
as follows:


if $y \notin \{c(x_i) \mid x_i \in X\}$ then $P(V_A(y') = \mathrm{Accept}) < \delta$


Even if the result is verified and cannot be forged ran- 
domly, a malicious remote host is able to identify possi- 
ble outputs of a circuit for chosen inputs. Therefore, 
integrity of execution alone (i.e. without confidentiality 
of execution) does not prevent B from performing sev- 
eral executions of the circuit and selecting the best 
result. A scheme that ensures both integrity and confi- 
dentiality of execution is thus highly desirable. 


2.2 Detailed Protection Scheme 


This section presents our solution to encrypt functions. 
A technique derived from the McEliece [28] public key 
cryptosystem is used. Unlike the McEliece scheme that 
encrypts data, our approach encrypts functions. Moreo- 
ver, this asymmetric scheme is used as a symmetric one 
by keeping both public and private keys secret. As a 
result, part of the attacks possible against the McEliece 
cryptosystem are not relevant in our scheme because 
attackers do not know the public key. 


Circuit Encryption 
All Boolean equations of the original plaintext circuit $c$
are encrypted using the McEliece technique [28] where
data are replaced by equations, $c' = E_A(c)$:

$$\underbrace{[y'_0 \ \cdots \ y'_{n-1}]}_{c'} = \underbrace{[y_0 \ \cdots \ y_{k-1}]}_{c}\, SGP + \underbrace{[z_0 \ \cdots \ z_{n-1}]}_{z}$$

Boolean equations $y_i = f(x_0, \ldots, x_{l-1})$ are multiplied by
the $k \times n$ matrix $SGP$ (for more details on Boolean circuit
encryption, see Figure 3). $G$ is a generating
matrix for an $[n, k, t]$ Goppa code $C$ [27] and $t$ is the
number of errors that the code is able to correct. $P$ is a
random $n \times n$ permutation matrix. Because of the
importance of hiding the systematic form of the code
[13], an additional matrix $S$ is used. $S$ is a random dense
$k \times k$ non-singular matrix. $S$, $G$ and $P$ are kept secret by
Alice. The $SGP$ matrix multiplication leads to a linear
composition of each cleartext Boolean equation of $c$.
The difference with respect to the original McEliece
scheme is that the result of the encryption is interpreted
as Boolean equations defining the encrypted circuit.


In addition to the $SGP$ multiplication, algorithm $E_A$
introduces errors in the circuit in order to detect integrity
attacks and prevent confidentiality attacks as
explained below. As error-correcting codes, Goppa
codes make it possible to efficiently remove these errors at
decoding time. Errors introduced by $E_A$ can be viewed as an
error circuit that, given an $l$-bit argument, returns an $n$-bit
string with a Hamming weight of $t$ that is computationally
indistinguishable from a random $n$-bit vector
with the same weight. Such a function, called $z$,
computationally indistinguishable from the set of functions
satisfying the weight restriction, exists: [31] proposes an
efficient construction for functions that output words of
a given weight.


Encrypted Circuit Evaluation and Verification 


Once c’ is created, the protocol between Alice and Bob 
is the following: 


• Alice sends $c'$ to Bob.


• Bob evaluates $c'$ on his data $x \in X$ and gets the
result $y' = c'(x) \in Y'$. There is an increase in the
number of Boolean outputs while the number of
inputs is kept unchanged. The result $y'$ is then sent
to Alice.


Figure 3: Encrypting a circuit: basic steps.
For the sake of clarity, only the $GP$ matrix multiplication is represented. It produces a partially encrypted circuit $c_{GP}$. To obtain the
encrypted circuit $c'$, it is necessary to use $S$ and $z$ too. The function to encrypt (1) is represented as a Boolean circuit $c$ (2). The
output matrix $Y$ (3) is multiplied by $GP$ (4). The result (5) is the partially encrypted output matrix $Y_{GP}$. It can be represented by
the corresponding Boolean equations (6) or as a “partially encrypted circuit” $c_{GP}$ (7).
• Alice decrypts the result (algorithm $D_A$). She first
removes permutation $P$: $y'P^{-1} = ySG + zP^{-1}$.
Permuting individual error contributions does not
change the Hamming weight of the vector and thus
$w(zP^{-1}) = w(z)$. The vector $z$ is a correctable error
vector since it is defined with $w(z) = t$ exactly. The
decoding algorithm for the code generated by $G$ can
correct an error with a weight of at most $t$, thus Alice
is able to retrieve the cleartext result $y = c(x) \in Y$
and the error vector $z$ from $zP^{-1}$.


• Alice finally performs the integrity verification
(algorithm $V_A$): if $w(z) = t$, the output is accepted;
otherwise, tampering with the evaluation of $c'$ is assumed.
For integrity verification, the error weight is
fixed. The maximum error weight that can be corrected
using Goppa codes is $t$.
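The protocol steps above can be sketched end to end with a toy code. This is only an illustration under loud assumptions: a [7,4] Hamming code correcting $t = 1$ error stands in for the Goppa code, the matrices `S` and `PERM` are fixed rather than random, the encrypted circuit is stored as a table rather than as Boolean equations, and the hash-based `z` is a hypothetical stand-in for the error circuit. Because this small code is perfect, most forged words still pass the weight check; the integrity argument only becomes meaningful with parameters such as those discussed in Section 2.3.

```python
import hashlib
from itertools import product

K, N, T = 4, 7, 1                        # toy [7,4] Hamming code, corrects t = 1
A = [[1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1]]
G = [[int(i == j) for j in range(K)] + A[i] for i in range(K)]   # G = (I | A)
S = [[1, 1, 0, 0], [0, 1, 1, 0], [0, 0, 1, 1], [0, 0, 0, 1]]
S_INV = [[1, 1, 1, 1], [0, 1, 1, 1], [0, 0, 1, 1], [0, 0, 0, 1]]
PERM = [3, 0, 6, 1, 5, 2, 4]             # (vP)[j] = v[PERM[j]]
KEY = b"alice-secret"                    # hypothetical key of the error function

def vec_mat(v, M):
    """Row vector times matrix over GF(2)."""
    return [sum(v[i] * M[i][j] for i in range(len(v))) % 2
            for j in range(len(M[0]))]

def apply_p(v):
    return [v[src] for src in PERM]

def apply_p_inv(w):
    v = [0] * N
    for j, src in enumerate(PERM):
        v[src] = w[j]
    return v

def z(x):
    """Stand-in error circuit: a weight-t word whose position depends on x."""
    pos = hashlib.sha256(KEY + bytes(x)).digest()[0] % N
    return [int(j == pos) for j in range(N)]

def encrypt_circuit(c, l):
    """E_A: tabulate c'(x) = c(x) S G P + z(x) for every input x."""
    table = {}
    for x in product((0, 1), repeat=l):
        code = apply_p(vec_mat(vec_mat(c(x), S), G))
        table[x] = [a ^ b for a, b in zip(code, z(x))]
    return table

def decrypt_and_verify(y_enc):
    """D_A and V_A: undo P, decode, check w(z) = t, then undo S."""
    w = apply_p_inv(y_enc)               # equals y S G + z P^-1
    syndrome = [(sum(w[i] * A[i][j] for i in range(K)) + w[K + j]) % 2
                for j in range(N - K)]
    columns = A + [[int(i == j) for j in range(N - K)] for i in range(N - K)]
    error = [int(col == syndrome) for col in columns]   # all zero if no error
    codeword = [a ^ b for a, b in zip(w, error)]
    accepted = sum(error) == T
    return vec_mat(codeword[:K], S_INV), accepted

def c(x):
    """Hypothetical 3-input, 4-output function used as the cleartext circuit."""
    v = 4 * x[0] + 2 * x[1] + x[2]
    r = (5 * v + 3) % 16
    return [(r >> s) & 1 for s in (3, 2, 1, 0)]

table = encrypt_circuit(c, 3)            # what Alice would send to Bob
```

Bob only ever looks up `table[x]`; Alice calls `decrypt_and_verify` on the returned word and rejects it when the recovered error does not have weight `T`.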


2.3 Scheme evaluation 


Confidentiality of Execution 


Confidentiality of execution relies on the hardness of
retrieving the equations of the circuit $c$ after their
multiplication with matrix $SGP$ and after adding the error $z$.
First of all, an enumeration attack to recover the circuit $c$
directly from $c'$ is infeasible using the code size proposed
by McEliece ([n=1024, k=524, t=50]). Moreover,
this attack requires the public key, which is not even
available to the attacker here.


Retrieving the error circuit $z$ from circuit $c'$ is
another possible attack. It is equivalent to trying to
retrieve a subspace from a set of codewords with
errors. In another context, this problem was termed
Decision Rank Reduction [38] and was proven to be
NP-complete. In order to avoid this attack, our solution
is based on errors with a Hamming weight equal to the
maximum correction capability of the code.


Nonetheless, transformation $E_A$ does not hide everything
about circuit $c$. Bob can identify inputs $(x_i, x_j)$
that have the same cleartext output $y_i = y_j$ because
the distance $d(y'_i, y'_j)$ between the ciphertext values
will be small so that errors remain correctable. In that
case, $y'_i + y'_j = y_i \cdot SGP + z_i + y_j \cdot SGP + z_j =
z_i + z_j \pmod 2$. An attacker would be able to recognize
such values because the Hamming weight of their sum is
at most equal to $2t$ even though he does not know the
cleartext result value $y_i$. Differential cryptanalyses
exploiting the fact that the error circuit $z$ does not
completely hide the linearity of the transformation were
described in [13] and [7]. These attacks only apply
when the public key is available, but it is kept
secret in our scheme. Moreover, the identification of
ciphertexts corresponding to the same cleartext can be
suppressed [35], at the cost of an increase in the
computational complexity of the encryption, decryption, and
verification algorithms.
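The observation about equal cleartext outputs can be illustrated with a short sketch. Everything here is an assumption made for the example: a repetition code (each bit repeated eight times) stands in for $y \cdot SGP$, a single flipped bit stands in for an error of weight $t = 1$, and the error positions are fixed by hand. The toy code is chosen so that distinct codewords are far apart, which keeps the $2t$ test free of false positives.

```python
from itertools import combinations

def likely_same_output(enc_outputs, t):
    """Return input pairs whose encrypted outputs differ in at most 2t bits."""
    pairs = []
    for (xi, yi), (xj, yj) in combinations(sorted(enc_outputs.items()), 2):
        weight = sum(a ^ b for a, b in zip(yi, yj))   # w(y'_i + y'_j)
        if weight <= 2 * t:
            pairs.append((xi, xj))
    return pairs

def toy_encrypt(y_bits, error_pos, repeat=8):
    """Stand-in for y*SGP + z: repeat each bit, then flip one position."""
    word = [b for b in y_bits for _ in range(repeat)]
    word[error_pos] ^= 1
    return word

def c(x):
    """Hypothetical circuit in which inputs (0,1) and (1,0) share an output."""
    return (x[0] & x[1], x[0] ^ x[1])

error_positions = {(0, 0): 1, (0, 1): 6, (1, 0): 4, (1, 1): 9}
enc = {x: toy_encrypt(c(x), pos) for x, pos in error_positions.items()}
same = likely_same_output(enc, t=1)
```

The host learns which inputs collide without learning the common cleartext value, which is exactly the leak described above.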


The majority voting attack described in [30], [39]
exploits the non-deterministic nature of the cryptosystem
to recover the secret code. The probability of success
of this attack depends on getting a high number of
different ciphertexts for each plaintext. This is not possible
in our scheme, which keeps the public key secret.


Integrity of execution 


Let us establish the probability that the verifier accepts an
invalid $y'$. The set of acceptable cleartext outputs is
$Y \subseteq \{0,1\}^k$. Due to the definition of the error function, it
is assumed that it is hard to establish any link between
the inputs to the error function and the error patterns.
Thus, picking a random encrypted output value
$y' \in \{0,1\}^n$ has a probability $\delta$ of being accepted. A’s
result $y$ can only be valid if its value is an element of
$Y \subseteq \{0,1\}^k$. The probability of a successful attack can
be calculated as “the probability of choosing a $y$ existing
in $Y$” times “the probability of generating a correct
error weight”. Generating a correct error means finding
a $t$-bit vector chosen at random among $n$ bits. The worst
case ($|Y| = 2^k$) leads to a probability of a successful
attack of:

$$\delta = \frac{\binom{n}{t}}{2^{n-k}}$$

For a Goppa code [n=1024, k=524, t=101], the probability of a successful
attack is: $\delta < 2^{-215}$
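The order of magnitude of this bound can be checked numerically. The sketch below is one reading of the product described above, not a formula confirmed by the source: a random $n$-bit word is accepted when it lies at Hamming distance exactly $t$ from one of the $|Y|$ codewords, and since spheres of radius $t$ around distinct codewords do not overlap, the accepted fraction is $|Y| \binom{n}{t} / 2^n$. The function name and the worst-case assumption $|Y| = 2^k$ are illustrative.

```python
import math

def log2_forgery_bound(n, k, t):
    """log2 of 2^k * C(n, t) / 2^n, the share of n-bit words that sit at
    distance exactly t from one of 2^k codewords (worst case |Y| = 2^k)."""
    return k + math.log2(math.comb(n, t)) - n

# t is left as a parameter so that different error weights can be compared.
mceliece = log2_forgery_bound(1024, 524, 50)
toy = log2_forgery_bound(7, 4, 1)
```

With the [n=1024, k=524, t=50] parameters quoted under Confidentiality of Execution, the value lies slightly below -215. For the toy [7,4] code it corresponds to 7/8, which shows why very small codes give no useful integrity guarantee.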


Figure 4: Modular implementation of the
encrypted function
Circuit Size Evaluation


The circuit c’ being evaluated by a remote host, it is 
important to minimize the impact of encryption on the 
circuit size, measured by its number of logical gates. 
Expansion rate cannot be calculated because it is spe- 
cific to the structure of the original circuit. However, it is 
possible to study the worst case. The encrypted function 
can be implemented by a modular circuit c’ as shown in 
Figure 4. 


In practice, the encrypted function is implemented by 
a circuit based on simplified equations, the size of this 
circuit being necessarily smaller than that of the equiva- 
lent modular circuit shown in Figure 4. The circuit 
based on simplified equations offers the same function- 
ality as the three different modules of Figure 4. Integrity 
and confidentiality properties of the protection scheme 
do not allow an attacker to retrieve the modular circuit 
from the equations. The size of the actual encrypted cir- 
cuit is smaller than the sum of the sizes of the three 
modules: 


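$$\mathit{Size}_{c'} < \underbrace{\mathit{Size}_{SGP} + \mathit{Size}_{z} + \mathit{Size}_{XOR}}_{\text{encryption represented as circuits } (\alpha)} + \underbrace{\mathit{Size}_{c}}_{\text{original circuit}}$$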


The matrix $SGP$ multiplication transforms the $k$
cleartext circuit outputs $y_i$ into $n$ encoded outputs that
are noted $y_{SGP,i}$: for instance,
$y_{SGP,0} = 1 \cdot y_1 + (0 \cdot y_2) + 1 \cdot y_3 + (0 \cdot y_4) + 1 \cdot y_5$,
and so on. For a given fixed number
of inputs and outputs, the size of the SGP-encoding circuit
($\mathit{Size}_{SGP}$) is fixed (proportional to $k \cdot n$) and the
size of the error circuit ($\mathit{Size}_{z}$) can be chosen. Both are
independent of the original circuit size ($\mathit{Size}_{c}$) and thus
$\mathit{Size}_{c'} < \mathit{Size}_{c} + \alpha$. When the circuit is simple
(e.g. $y = \mathrm{NOT}(x)$ or $y = 2x$), then $\mathit{Size}_{c} \ll \alpha$ and
encrypting a circuit significantly increases the circuit size. However,
when the circuit is large compared with the sizes of
equivalent circuits for $SGP$ multiplication and error,
then $\mathit{Size}_{c} \gg \alpha$ and encrypting the circuit has a negligible
effect on the circuit size increase: $\mathit{Size}_{c'} \approx \mathit{Size}_{c}$
(the size of the XOR function being negligible).


3 Protecting Functions within a Program 


As explained in the previous section, function protection
only allows the execution of one function while a program
has to perform several functions in sequence. A
naive approach is to represent a program as a single
Boolean circuit and protect it using the function
protection scheme described above. Unfortunately, this
approach, which requires huge circuits, is totally
unrealistic.


We propose a solution to the problem of software 
protection that consists in delegating the verification and 
decryption tasks (originally performed by the circuit 
owner A) to a trusted tamper-proof hardware (TPH) 
located at the untrusted site (a preliminary proposal was 
described in [26]). This TPH must be directly accessible 
by the untrusted host in order to suppress all interactions 
with the code owner (A) when the software is executed. 
The suggested solution assures the security of the cir- 
cuits executed on an untrusted runtime environment. It 
also makes it possible to securely perform multi-step 
evaluation of functions. Additionally, this scheme enables
a program to deliver a cleartext result to the
untrusted host without having to contact the code owner.


Figure 5: Installation of the program on the untrusted host and trusted TPH.
A has provided a TPH to B. 1) the functions used by the program (i.e. Boolean circuits in our implementation) are encrypted;
2) the control structure is modified to call those new functions; 3) the verification and decryption algorithms corresponding to
the encryption are generated; 4) the encrypted functions are sent to the untrusted host; 5) the control structure and the
verification and decryption algorithms are sent to the TPH using a secure channel.

Figure 6: Evaluating an encrypted program on the untrusted host.
6) the TPH calls a function according to the control structure; 7) the TPH can input intermediate results; 8) the selected
encrypted function is executed with data provided by B and/or the TPH; 9) the encrypted result is returned to the TPH; 10) the
inputs are used to allow a more efficient verification (see Section 3.2); 11) the temporary result is stored; 12) the next
encrypted function is called, and so on.


A wide deployment of mobile code makes
it unlikely that expensive tamper-proof hardware will be
used: this implies that the TPH will be limited in terms
of storage and computational capacity. Even though our
solution for multi-step execution is based on the protection
technique described in Section 2, it has to be
adapted to cope with the computation power limitations
imposed by the TPH (see Section 3.2). The use of a
TPH has already been suggested for delegating the
functionality of a trusted party in specific contexts, as can be
seen in host-assisted secret key [8] or public key [17]
cryptography applications. [6] proposes to separate a
program into several pieces but does not deal with
encrypted functions.


3.1 Computational Model 


A program can be modeled as a set of functions plus a
control structure, which defines the sequencing of functions.
As in the previous section, functions are implemented
with circuits. The computation of each
individual circuit $c_i$ depends on a set of inputs $x$ received
from the host and from the memory of the TPH. As
before, the protection of each circuit from the untrusted
environment where it is evaluated is achieved through its
encryption. The control structure is uploaded to the
tamper-proof hardware to protect it. Based on this control
structure, the TPH instructs the untrusted environment
to execute one of the encrypted circuits $c'_i$. For
each output of circuit $c'_i$, the TPH is able to verify the
integrity of the result and to retrieve the cleartext result $y$
in an efficient way. Each circuit $c_i$ is encoded with a new
algorithm $E'_A$.


A state of the computation can be maintained in the
trusted TPH; in other words, memory attacks need not be
taken into consideration. It is mandatory that B receives a
TPH trusted by A, which is not a very restrictive hypothesis.
For instance, A could be a bank or an operator that
provides a smart card to its client B, just like they
already provide credit cards and SIM cards. A verification
and decryption algorithm must be installed on the
TPH via a secure channel, either before the TPH is distributed
to clients or transmitted in encrypted form using
a secret shared by A and her TPH.


Once the encrypted circuits ($c'_1, c'_2, \ldots$) are installed
(Figure 5) on host B, the TPH is in charge of choosing
which function has to be evaluated and of providing a
part of the inputs ($x_{TPH,i}$), the other part being provided
by B ($x_B$). After each step (i.e. each encrypted function
evaluation), the TPH deciphers and verifies the returned
result (Figure 6). Note that a given function can be evaluated
more than once with different inputs. When the
TPH chooses the next encrypted function to execute
($c'_{i+1}$), it provides input data ($x_{TPH,i+1}$). Those data are
stored on the TPH.
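The interaction between the TPH and the untrusted host can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not the paper's implementation: the class names, the 8-bit functions, and the `protect` helper are hypothetical, and `protect` is only a placeholder with the same interface as the encryption, decryption and verification algorithms (it duplicates the output and masks it with a keyed pad, and hides nothing from a real attacker since the key sits inside the closure).

```python
import hashlib

KEY = b"shared-by-A-and-TPH"        # hypothetical secret installed on the TPH

def pad(x_tph, x_b):
    """Stand-in for the error pattern z(x): a 16-bit mask derived from x."""
    digest = hashlib.sha256(KEY + bytes([x_tph, x_b])).digest()
    return int.from_bytes(digest[:2], "big")

def protect(c):
    """Placeholder for E_A: returns (c', decrypt-and-verify) for 8-bit c."""
    def c_enc(x_tph, x_b):                       # run by the untrusted host
        y = c(x_tph, x_b)
        return ((y << 8) | y) ^ pad(x_tph, x_b)  # redundancy plus mask
    def decrypt_verify(x_tph, x_b, y_enc):       # run inside the TPH
        w = y_enc ^ pad(x_tph, x_b)
        return w >> 8, (w >> 8) == (w & 0xFF)
    return c_enc, decrypt_verify

class UntrustedHost:
    """B: stores the encrypted functions and supplies its own inputs x_B."""
    def __init__(self, encrypted_functions, inputs):
        self.encrypted_functions = encrypted_functions
        self.inputs = list(inputs)

    def evaluate(self, index, x_tph):
        x_b = self.inputs.pop(0)
        return x_b, self.encrypted_functions[index](x_tph, x_b)

class TPH:
    """Trusted device: holds the control structure, the state and D/V."""
    def __init__(self, control, decrypt_verify):
        self.control = control                   # sequence of function indices
        self.decrypt_verify = decrypt_verify
        self.memory = 0                          # intermediate cleartext result

    def run(self, host):
        for index in self.control:
            x_tph = self.memory                  # input provided by the TPH
            x_b, y_enc = host.evaluate(index, x_tph)
            y, ok = self.decrypt_verify[index](x_tph, x_b, y_enc)
            if not ok:
                raise ValueError("tampering detected")
            self.memory = y                      # stored for the next step
        return self.memory

functions = [lambda m, b: (m + b) % 256, lambda m, b: (m * b) % 256]
protected = [protect(f) for f in functions]
host = UntrustedHost([p[0] for p in protected], inputs=[5, 3, 4])
tph = TPH(control=[0, 1, 0], decrypt_verify=[p[1] for p in protected])
result = tph.run(host)
```

The control structure never leaves the `TPH` object, and the same function (index 0) is evaluated twice with different inputs, as the text allows.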
3.2 Protection Scheme


The algorithms devised in the protection scheme of Section 2
have to be adapted to the new scenario in which
the TPH has less computational power than the party
that it represents. The algorithm used for function
encryption remains the same since this operation is performed
by the code owner, but the error computation is
modified in order to simplify the verification performed
by the TPH. The new verification and decryption algorithms
are respectively called $V_{TPH}$ and $D_{TPH}$. In order
to simplify the verification and decryption, $x_B$ is transmitted
to the TPH.


The integrity of execution relies on the difficulty of
creating forged pairs $(x, y')$ that pass the verification
process ($x_{TPH}$ being known). Using the same terminology
as in the previous section, this probability can be
defined as follows:


if $y \neq c(x)$ then $P(V_{TPH}(x, y') = \mathrm{Accept}) < \delta$


$x$ being $(x_{TPH} \mid x_B)$
Error Circuit 


As in the classical McEliece scheme, the function protection
scheme of Section 2 introduced at most (and in
our case, exactly) $t$ errors into the encoded circuit: this
represents the maximum number of correctable errors
using the capacity of the code, since in that scheme, A
did not know B’s input $x$. This value is now retrieved on
the TPH, which also possesses the error circuit $z(x)$ and can
entirely suppress the error without being restricted to a
specific correction capacity. It is thus possible to introduce
errors of much higher weight into the encrypted circuit.


The security parameter $q$, with $0 < q \le n$, indicates
the maximum weight of the error introduced. Using a
Goppa code [n=1024, k=524, t=101], this parameter
might be as high as 1024 bits, meaning that all bits of a
given output $y'$ might be in error, instead of t=101 bits
as in the scheme of Section 2. In the general case, the
number of errors introduced will be smaller than this
upper bound, yet higher than the correctable case. This
considerably limits enumeration attacks for retrieving
the error circuit.


Nonetheless, the error circuit size must remain reasonable
to retain any advantage from executing $c'$ on the
untrusted host rather than $c$ on the TPH. For the
construction of the error circuit, a trade-off should be found
between the highest possible number $n$ of simple error
equations and a smaller number of more complex equations
closer to a random error.


Result Decryption 


In the new scheme, the decipherment is based on the
inputs and outputs of the encrypted function evaluation.
For each evaluation of circuit $c'$, the TPH, which knows
$x_{TPH}$, receives $y'$ and $x_B$. The encrypted result can be
written: $y' = ySGP + z(x_{TPH} \mid x_B)$. $S$, $G$, and $P$ being
known to the TPH, as well as the error circuit $z(x)$, it is
possible to first compute and remove the error pattern:

$$ySGP = y' + z(x_{TPH} \mid x_B)$$

Since matrix $G$ is in systematic form $(I \mid A)$, the $GP$
encoding can be removed as follows:

$$ySG = yS(I \mid A) = (y' + z(x_{TPH} \mid x_B))P^{-1} \qquad (1)$$

$$yS = ySI = \lceil (y' + z(x_{TPH} \mid x_B))P^{-1} \rceil_k \qquad (2)$$

where $\lceil v \rceil_i$ is the vector formed by the first $i$ bits of
vector $v$.


The cleartext output $y$ can finally be retrieved as follows:

$$y = \lceil (y' + z(x_{TPH} \mid x_B))P^{-1} \rceil_k \, S^{-1}$$


Integrity Verification 


The verification algorithm is adapted to the new construction
of the error circuit. Integrity of execution is
ensured by checking that all bits of the cleartext output
are correct after having removed the supposed error
pattern. Since the cleartext output $y$ can be obtained by
using only the first $k$ bits of $ySG$, the remaining and
redundant $n-k$ bits are used to verify that the output
computed by the untrusted host has not been tampered
with.


From decryption equation 1 (above), and using only the
last $n-k$ bits of $ySG$:

$$ySA = \lfloor (y' + z(x_{TPH} \mid x_B))P^{-1} \rfloor_{n-k}$$

where $\lfloor v \rfloor_i$ is the vector formed by the last $i$ bits of
vector $v$.


Since $ySA = (yS)A$, it can be deduced from decryption
equation 2 that the TPH needs only verify that the
following equation is satisfied:

$$\lfloor (y' + z(x_{TPH} \mid x_B))P^{-1} \rfloor_{n-k} = \lceil (y' + z(x_{TPH} \mid x_B))P^{-1} \rceil_k \, A$$
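The decryption and verification performed on the TPH can be sketched as follows. This is a toy illustration under stated assumptions: a systematic [7,4] code $G = (I \mid A)$ replaces the Goppa code, `S` and `PERM` are fixed instead of random, and the hash-based `z` is a hypothetical stand-in for the error circuit that may produce any weight up to $n$. The point is that no decoding algorithm is needed: the TPH recomputes $z(x)$, strips it, and compares the redundant bits.

```python
import hashlib

K, N = 4, 7
A = [[1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1]]     # G = (I | A), toy code
G = [[int(i == j) for j in range(K)] + A[i] for i in range(K)]
S = [[1, 1, 0, 0], [0, 1, 1, 0], [0, 0, 1, 1], [0, 0, 0, 1]]
S_INV = [[1, 1, 1, 1], [0, 1, 1, 1], [0, 0, 1, 1], [0, 0, 0, 1]]
PERM = [3, 0, 6, 1, 5, 2, 4]                         # (vP)[j] = v[PERM[j]]
KEY = b"tph-secret"                                  # hypothetical

def vec_mat(v, M):
    """Row vector times matrix over GF(2)."""
    return [sum(v[i] * M[i][j] for i in range(len(v))) % 2
            for j in range(len(M[0]))]

def apply_p(v):
    return [v[src] for src in PERM]

def apply_p_inv(w):
    v = [0] * N
    for j, src in enumerate(PERM):
        v[src] = w[j]
    return v

def z(x):
    """Stand-in error circuit z(x_TPH | x_B) with unrestricted weight."""
    byte = hashlib.sha256(KEY + bytes(x)).digest()[0]
    return [(byte >> j) & 1 for j in range(N)]

def encrypted_eval(y, x):
    """What c'(x) outputs when c(x) = y: y' = y S G P + z(x)."""
    code = apply_p(vec_mat(vec_mat(y, S), G))
    return [a ^ b for a, b in zip(code, z(x))]

def tph_decrypt_verify(x, y_enc):
    """D_TPH and V_TPH: strip z(x) and P, compare redundant bits, undo S."""
    w = apply_p_inv([a ^ b for a, b in zip(y_enc, z(x))])   # (yS | ySA)
    first, last = w[:K], w[K:]
    accepted = last == vec_mat(first, A)     # last n-k bits = (first k bits) A
    return vec_mat(first, S_INV), accepted

x = (1, 0, 1, 1, 0)                          # x = (x_TPH | x_B), known to TPH
y = [1, 0, 1, 1]
y_enc = encrypted_eval(y, x)
recovered, ok = tph_decrypt_verify(x, y_enc)
```

Only XORs, a permutation and two small matrix products are involved, which matches the aim of keeping the TPH workload low.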
3.3 Scheme Evaluation 


Confidentiality of Execution 


Evaluating the confidentiality relies on the same principles
as in the previous section. Since it is now possible
to introduce errors of higher weight, the complexity of
retrieving the initial circuit is increased by a ratio
depending upon the chosen security parameter $q$.


A new problem is introduced by the multi-step execution
concerning intermediate cleartext results. It sometimes
happens that $x_{TPH,i} = D_{TPH}(y'_{i-1})$, that is, B can
observe the cleartext result of a previously computed
function. The fact that the cleartext result may be given
back to the untrusted environment is critical. With a
sufficient number of pairs of cleartext inputs and outputs,
the untrusted host would be able to interpolate circuit $c$.
For more details, refer to the limitations section below.


Integrity of Execution 


An enumeration attack amounts to obtaining a forged
pair $(x, y')$ acceptable for the verifier. Since the TPH can
get access to the input in addition to the encrypted output
of circuit evaluation, the verification is now performed
using the actual error pattern and not the error
weight as before. The probability of such an attack
being successful is thus even smaller than in the scheme
of Section 2. As for confidentiality, the use of errors of
higher weight can further increase the difficulty
of breaking the integrity of execution.


Moreover, part of the input is provided by the TPH
and cannot be modified by the untrusted host: this
makes it possible to obtain the equivalent of a variable
error pattern for the inputs restricted to $x_B$, the untrusted
host inputs. In other words, even if the untrusted host B
is able to determine the error pattern for a given $x_{TPH}$,
this pattern will not be useful for another value of $x_{TPH}$.
In practice, $x_{TPH}$ will vary for nearly each computation
of a given function throughout the lifetime of a program.


Limitations 


As to the limitations of this approach, it is obvious that 
the algorithmic structure of the protected program is not 
hidden: the repeated execution of a function can be 
traced. Private Information Retrieval techniques (PIR) 
[15] and oblivious RAM models [20] that hide the 
sequence of accesses provide a sophisticated solution to 
this problem. Unfortunately, those works have shown 
that hiding access patterns is prohibitively expensive. 
This is the reason why our scheme addresses the protec- 
tion of each function used by a program rather than the 
protection of its algorithmic structure. 


The result of some encrypted function $f_i$ can be used
as the input to another function $f_j$. This means that a
malicious host can sometimes observe the cleartext
result of one of the encrypted functions that form the
building blocks of a program. Depending on the function
and on the number of times it is evaluated, it might
be possible to obtain enough cleartext inputs and outputs
to interpolate the function. In order to avoid this
problem, a scheme using enciphered inputs and outputs
could be used, but the performance penalty is significant.


However, even if the confidentiality of execution is 
partly broken, the integrity is not attacked. Indeed, inter- 
polating a function allows an attacker to compute all 
corresponding cleartext outputs y, but knowing y and y’ 
is already not sufficient to break the McEliece crypto- 
system. 


Implementability 


Any function with fixed-length inputs and outputs can 
be represented as a Boolean circuit and thus encrypted 
by our scheme. However, to have an efficient implemen- 
tation, it is necessary to define a computation model fit- 
ting the requirement of this approach. The TPH and the 
untrusted host have to support two different computing 
models: 


• An algorithmic logic similar to what can be found in
smart cards.


• A functional model supporting a representation of
Boolean circuits. It could be implemented as truth
tables (memory) or as circuits (programmable logic).
4 Related Work


This section presents other work related to the confiden- 
tiality or the integrity of execution. We also compare our 
solution with two other approaches, truth table encryp- 
tion and gate-level encryption, that can be used to pro- 
tect functions represented as Boolean circuits. 


4.1 Confidentiality of Execution 


Secure function evaluation is an instance of the more 
general problem of confidentiality of execution. Secure 
function evaluation has been addressed by many 
researchers ([41], [42], [19], [3], and [2], just to mention
a few). Non-interactivity is an important requirement 
for mobile code, but the protocols addressing the circuit 
model need a round complexity dependent on the 
number of gates or depth of the circuit and are thus not 
well adapted to mobile code. 


Sander and Tschudin [33], [32] defined what they 
called a function hiding scheme and focused on non- 
interactive protocols. In their framework, the privacy of f 
is assured by an encrypting transformation. The authors
illustrated the concept with a method that allows com- 
puting with encrypted polynomials, based on the Gold- 
wasser-Micali encryption scheme [18]. Sander and 
Tschudin took advantage of the homomorphic proper- 
ties of the above encryption scheme to encrypt the coef- 
ficients of the polynomial, thus their technique does not 
hide the skeleton of the polynomial. Moreover, polyno- 
mials are not as expressive as Boolean circuits. 


[36] presented a non-interactive solution for the secure
evaluation of circuits, but it is restricted to log-depth
circuits (or $NC^1$ circuits). Protocols were designed for
processing NOT and OR gates in a private way. The
restriction on the depth of the circuit comes from the
increase of the output size by a constant factor when
computing an OR gate.


In [14] and [1], more efficient techniques are presented 
that combine encrypted circuits [42] and one-round 
oblivious transfers. However, the circuit expansion is 
high with this technique, and this expansion compromises 
the narrow performance advantage of mobile code 
shown in [22]. 


4.2 Integrity of Execution 


Integrity of execution is the possibility for the circuit 
owner to verify the correctness of the execution of his 
circuit. This problem has been extensively studied for 
achieving reliability (see for example [12] for a survey) 
but security requirements taking into account possible 
malicious behavior from the execution environment 
were not considered. 


Other solutions cope with the maliciousness of the 
execution environment. Yee [44] suggested the use of 
proof based techniques, in which the untrusted host has 
to forward a proof of the correctness of the execution 
together with the result. Complexity theory shows how 
to build proofs for NP languages and, more recently, how 
to build Probabilistically Checkable Proofs (PCP) [4], [5]. 
PCP proofs require checking only a subset of the proof 
in order to assure the correctness of a statement. However, 
this subset has to be randomly determined by the 
checker, so the problem of using PCP proofs in our 
non-interactive scenario is that the prover has to commit to 
the overall PCP proof. We refer the interested reader to 
[21] for a comprehensive survey of the work on proofs. 


In [11], the authors presented an interesting model 
for mobile computing and a solution that overcomes the 
problem of using PCP proofs. The agent is modeled as a 
probabilistic Turing machine, and the set of all possible 
states of this machine constitutes an NP language. There 
exists a verification process for language membership, 
that is, it is possible to check if an obtained state belongs 
to the language. This scheme relies on the use of non- 
interactive Private Information Retrieval techniques to 
avoid the transmission of the overall PCP proof of the 
specified language, the randomly chosen queries from 
the checker being encrypted. Our second scheme allows 
us to “trace” in real-time an execution step-by-step, one 
step being a function evaluation, and ensuring that each 
step is performed in accordance with the program 
semantics. In our scheme, verifying an execution does 
not require verifying a complex trace. 


4.3 Encrypted Boolean Circuit Approaches 


Boolean circuits can be protected from confidentiality 
and integrity attacks using three different encryption 
techniques: circuit encryption (our approach from Sec- 
tion 2), truth table encryption, and gate-level encryption 
[36], [19]. 


Circuit Encryption vs. Truth Table Encryption 


Evaluating a function with a truth table simply corresponds 
to choosing the line of the table that corresponds 
to the inputs and contains the outputs of the circuit. 
A simple protection of this scheme is to encrypt each 
output of the truth table, line by line, with a standard 
encryption algorithm. A new truth table with encrypted 
outputs is then obtained. In this approach, each result is 
pre-calculated. 
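
The line-by-line protection described above can be illustrated with a minimal Python sketch. It is not the construction evaluated in this paper: the example function `add2`, the key, and the HMAC-derived pad standing in for the "standard encryption algorithm" are all assumptions made for illustration only.

```python
import hashlib
import hmac

def pad(key: bytes, row: int, nbits: int) -> int:
    """Toy per-row pad: HMAC-SHA256 of the row index, truncated to nbits.

    This stands in for a standard cipher and is an assumption of the sketch.
    """
    digest = hmac.new(key, row.to_bytes(4, "big"), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") & ((1 << nbits) - 1)

def build_table(f, n_inputs: int) -> list:
    """Pre-calculate every output of f, one line per input combination."""
    return [f(x) for x in range(1 << n_inputs)]

def encrypt_table(table: list, key: bytes, nbits: int) -> list:
    """Encrypt each output line independently of the others."""
    return [out ^ pad(key, row, nbits) for row, out in enumerate(table)]

def decrypt_output(enc_out: int, row: int, key: bytes, nbits: int) -> int:
    """Only the key holder can recover the cleartext output of a line."""
    return enc_out ^ pad(key, row, nbits)

def add2(x: int) -> int:
    """Hypothetical example function: add two 2-bit operands packed in x."""
    return (x >> 2) + (x & 3)

KEY = b"demo key, illustration only"
CLEAR = build_table(add2, 4)            # 16 lines, 3-bit outputs
ENCRYPTED = encrypt_table(CLEAR, KEY, 3)
```

The untrusted host evaluates the function by looking up `ENCRYPTED[x]`; the table size is fixed by the number of inputs, whatever the cleartext circuit looked like.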


Since the truth table outputs are encrypted line by line, 
the encrypted function it represents is by definition 
constructed randomly. Shannon showed that the size of 
almost every function with $l$ inputs and one output is 
bigger than $2^l / l$. The size of the circuit implementing an 
encrypted truth table with $n$ outputs can thus be assumed 
to be bigger than $n \cdot 2^l / l$ gates. This size does not depend 
on the initial cleartext circuit but essentially on the 
number of inputs and can be bounded: any function with 
$l$ inputs and one output can be computed by a circuit of 
size $O(l \cdot 2^l)$ [40]. 
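
To give a feel for these magnitudes, the short sketch below evaluates Shannon's lower bound of about 2^l / l gates per output and the generic l * 2^l upper bound for a few input widths. The function names are hypothetical and constant factors are ignored, so the figures are orders of magnitude only.

```python
def shannon_lower_bound(l: int, n_outputs: int = 1) -> float:
    """Approximate gate count exceeded by almost every l-input function,
    scaled by the number of outputs."""
    return n_outputs * 2 ** l / l

def generic_upper_bound(l: int) -> int:
    """Order of magnitude of a circuit that computes any l-input,
    one-output function (constant factor omitted)."""
    return l * 2 ** l

if __name__ == "__main__":
    for l in (8, 16, 32):
        print(l, shannon_lower_bound(l), generic_upper_bound(l))
```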


Our scheme modifies Boolean equations rather than 
outputs and, as shown in Section 2.3, $Size_{c'} = Size_c + a$. 
It is possible to have $Size_{c'} \approx Size_c$ under reasonable 
assumptions about the size of $c$ compared with the 
size of the error circuit or the SGP multiplication circuit 
(in a modular implementation). In the worst case, when 
the cleartext circuit size $Size_c$ is close to $l \cdot 2^l$, the size 
$Size_{c'}$ of the resulting encrypted circuit is not better than 
the size of an equivalent encrypted truth table. 


Circuit Encryption vs. Gate Level Encryption 


The gate-level encryption [36], [19] is a Computing 
with Encrypted Data scheme. Each gate of the circuit is 
replaced by a cryptographic module that uses keys as 
inputs and outputs to represent true or false Boolean val- 
ues. A function evaluation corresponds to cryptographic 
operations performed gate by gate. However, it is possi- 
ble to observe the resulting construction and to deduce 
the initial circuit. This solution thus does not ensure 
confidentiality but only integrity. Valiant’s universal cir- 
cuit [37] makes it possible to see circuits as data. Thanks 
to it, it is possible to convert the Computing with 
Encrypted Data scheme into Computing with Encrypted 
Functions. 


The main advantage of the gate-level encryption 
scheme is the linear impact of the encryption on the cir- 
cuit size. Indeed, each gate of the initial circuit is 
replaced by one module and associated keys. When the 
universal circuit is used, the resulting size is 
$O(d \cdot s \cdot \log s)$ modules, where s is the size of the initial 
circuit and d its depth. This size increase is small and 
only depends on the initial circuit. 


This approach has drawbacks: it is necessary either to 
interact with the circuit owner for each gate evaluation 
or to use oblivious transfers to provide inputs. Moreover, 
the scheme allows only one evaluation of the circuit; 
otherwise integrity (and confidentiality, when a 
universal circuit is used) cannot be ensured. 


In comparison, our approach is similar to the truth 
table in that it is an encryption of the whole output that 
is performed instead of a bit-by-bit encryption of the 
output. This makes it possible to evaluate a function 
more than once. Bit-by-bit encryption yields too much 
information about the circuit structure to permit two 
consecutive evaluations: a new encrypted circuit has to 
be recomputed after each evaluation. 


5 Conclusion 


This paper presented basic building blocks for securing 
mobile code executed in a potentially hostile environ- 
ment. It first described a scheme that can autonomously 
evaluate a Boolean circuit in a potentially malicious 
environment. This scheme ensures at the same time the 
integrity and confidentiality of evaluation of the circuit. 
The protection is derived from the McEliece data 
encryption scheme, thus allowing an efficient encryp- 
tion, verification and decryption. The original circuit is 
encrypted into a new circuit, which can be executed by 
an untrusted environment although its result can only be 
decrypted by the circuit owner. This scheme can gener- 
ate an encrypted circuit with a size close to that of the 
original circuit. All functions implementable with 
Boolean circuits can be protected using this scheme. 


Any program can be implemented as a single circuit 
and thus be protected using this function protection 
scheme. In practice however, that approach is totally 
unrealistic because of the huge size of the circuit. The 
second part of this article introduces another protection 
scheme that deals with programs rather than functions. 
This scheme resorts to using a tamper-proof hardware, 
albeit with a limited capacity compared with the pro- 
gram processing needs. The tamper-proof hardware pro- 
tects the scheduling of a set of encrypted functions 
executed directly in the untrusted environment. The 
tamper-proof hardware also performs the result decryp- 
tion and verification, which were previously done by the 
code owner. 


The practical deployment of the latter scheme may 
finally be questioned because of the lingering requirement 
for tamper-proof hardware. However, a few years 
ago, authentication had similar needs, and now such 
hardware is in wide use for authentication purposes. We 
envision the use of cheap tamper-resistant hardware like 
slightly modified smart cards as a possible solution. 


References 


[1] J. Algesheimer, C. Cachin, J. Camenisch, and 
G. Karjoth. Cryptographic security for mobile 
code. In Proc. of the IEEE Symposium on Security 
and Privacy, May 2001. 


[2] Martin Abadi and Joan Feigenbaum. Secure cir- 
cuit evaluation. Journal of Cryptology, 2(1):1-12, 
1990. 


[3] Martin Abadi, Joan Feigenbaum, and Joe Kilian. 
On hiding information from an oracle. Journal of 
Computer and System Sciences, 39(1):21-50, Au- 
gust 1989. 


[4] Sanjeev Arora, Carsten Lund, Rajeev Motwani, 
Madhu Sudan, and Mario Szegedy. Proof verification 
and hardness of approximation problems. In 
Proc. 33rd IEEE Foundations of Computer Science, 
pages 14-23, October 1991. 


[5] Sanjeev Arora and Shmuel Safra. Probabilistic 
checking of proofs: A new characterization of NP. 
Journal of the ACM, 45(1):70-122, 1998. 


[6] D. Aucsmith. Tamper resistant software: an 
implementation. In Proc. International Workshop on 
Information Hiding, 1996. Cambridge, UK. 


[7] Thomas A. Berson. Failure of the McEliece pub- 
lic-key cryptosystem under message-resend and 
related-message attack. In Burton S. Kaliski Jr., 
editor, Advances in Cryptology—CRYPTO ’97, 
volume 1294 of Lecture Notes in Computer Sci- 
ence, pages 213-220. Springer-Verlag, 17- 
21 August 1997. 


[8] Matt Blaze, Joan Feigenbaum, and Moni Naor. A 
formal treatment of remotely keyed encryption. In 
Kaisa Nyberg, editor, Advances in Cryptology - 
EUROCRYPT ’98, Lecture Notes in Computer 
Science, pages 251-265, Finland, 1998. Springer- 
Verlag. 


[9] Manuel Blum and Sampath Kannan. Designing 
programs that check their work. In Proceedings of 
the Twenty First Annual ACM Symposium on Theory 
of Computing, pages 86-97, Seattle, Washington, 
15-17 May 1989. 


[10] Matt Blaze. High-bandwidth encryption with 
low-bandwidth smartcards. In Dieter Grollman, editor, 
Fast Software Encryption: Third International 
Workshop, volume 1039 of Lecture Notes in Computer 
Science, pages 33-40, Cambridge, UK, 
21-23 February 1996. Springer-Verlag. 


[11] Ingrid Biehl, Bernd Meyer, and Susanne Wetzel. 
Ensuring the integrity of agent-based computations 
by short proofs. In Kurt Rothermel and Fritz 
Hohl, editors, Proc. of the Second International 
Workshop, Mobile Agents 98, pages 183-194, 
1998. Springer-Verlag Lecture Notes in Computer 
Science No. 1477. 


[12] Manuel Blum and Hal Wasserman. Software 
reliability via run-time result-checking. Journal of the 
ACM, 44(6):826-849, November 1997. 


[13] Anne Canteaut. Attaques de Cryptosystèmes à 
Mots de Poids Faible et Construction de Fonctions 
t-Résilientes. PhD thesis, Université Paris 
VI, October 1996. 


[14] C. Cachin, J. Camenisch, J. Kilian, and Joy 
Muller. One-round secure computation and secure 
autonomous mobile agents. In Proceedings of the 
27th International Colloquium on Automata, Languages 
and Programming - ICALP 2000, Geneva, 
July 2000. 


[15] B. Chor, O. Goldreich, E. Kushilevitz, and M. 
Sudan. Private information retrieval. In Proceedings 
of 36th IEEE Conference on the Foundations of 
Computer Science (FOCS), pages 41-50, 1995. 


[16] C. Collberg, C. Thomborson, and D. Low. A 
taxonomy of obfuscating transformations. Technical 
Report 148, Department of Computer Science, 
University of Auckland, 1996. 


[17] Joan Feigenbaum. Locally random reductions in 
interactive complexity theory. DIMACS Series in 
Discrete Mathematics and Theoretical Computer 
Science, 13:73-98, 1993. 


[18] Shafi Goldwasser and Silvio Micali. Probabilistic 
encryption. Journal of Computer and System Sciences, 
28(2):270-299, April 1984. 


[19] Oded Goldreich, Silvio Micali, and Avi Wigderson. 
How to play any mental game or a completeness 
theorem for protocols with honest majority. 
In Proceedings of the Nineteenth Annual ACM 
Symposium on Theory of Computing, pages 218-229, 
New York City, 25-27 May 1987. 


[20] Oded Goldreich and Rafail Ostrovsky. Software 
protection and simulation on oblivious RAMs. 
Journal of the ACM, 43(3):431-473, May 1996. 


[21] Oded Goldreich. Modern Cryptography, Probabilistic 
Proofs and Pseudorandomness. Springer-Verlag, 
1999. 


[22] Daniel Hagimont and Leila Ismail. A performance 
evaluation of the mobile agent paradigm. In 
Proceedings of the Conference on Object-Oriented 
Programming, Systems, Languages and Applications, 
pages 306-313, Denver, USA, November 1999. 


[23] Sergio Loureiro and Refik Molva. Function hiding 
based on error correcting codes. In Manuel Blum 
and C. H. Lee, editors, Proceedings of Cryptec'99 
- International Workshop on Cryptographic Techniques 
and Electronic Commerce, pages 92-98. 
City University of Hong-Kong, July 1999. 


[24] Sergio Loureiro and Refik Molva. Privacy for 
Mobile Code. In Proceedings of the Distributed 
Object Security Workshop - OOPSLA'99, pages 37-42, 
Denver, November 1999. 


[25] Sergio Loureiro, Refik Molva, and Yves Roudier. 
Mobile code security. In Proceedings of ISYPAR'2000, 
4ème Ecole d'Informatique des Systèmes Parallèles 
et Répartis, Toulouse, France, February 2000. 


[26] Sergio Loureiro and Refik Molva. Mobile Code 
Protection with Smartcards. In 6th ECOOP Workshop 
on Mobile Object System, Cannes, France, 
June 2000. 


[27] R.J. McEliece. The theory of information and coding. 
Encyclopedia of Mathematics and its Applications, 
Vol. 3, Addison-Wesley, Reading, MA, 1977. 


[28] R. McEliece. A public-key cryptosystem based on 
algebraic coding theory. In Jet Propulsion Lab. 
DSN Progress Report, 1978. 


[29] Silvio Micali. CS Proofs (extended abstract). In 
IEEE Proceedings of Foundations on Computer 
Science, pages 436-453, 1994. 


[30] Joost Meijers and Johan van Tilburg. Extended 
Majority voting and private-key algebraic-code 
encryptions. In Hideki Imai, Ronald L. Rivest, and 
Tsutomu Matsumoto, editors, Advances in Cryptology - 
ASIACRYPT '91, volume 739 of Lecture 
Notes in Computer Science, pages 288-298, Fujiyoshida, 
Japan, 11-14 November 1991. Springer-Verlag. 
Published 1993. 


[31] Nicolas Sendrier. Efficient generation of binary 
words of given weight. In Colin Boyd, editor, 
Cryptography and Coding; proceedings of the 5th 
IMA conference, number 1025 in Lecture Notes in 
Computer Science, pages 184-187. Springer-Verlag, 
1995. 


[32] Tomas Sander and Christian Tschudin. On software 
protection via function hiding. In Proceedings 
of the Second Workshop on Information 
Hiding, Portland, Oregon, USA, April 1998. 


[33] Tomas Sander and Christian Tschudin. Towards 
mobile cryptography. In Proceedings of the 1998 
IEEE Symposium on Security and Privacy, Oakland, 
California, May 1998. 


[34] Luis F. G. Sarmenta. Volunteer Computing. Ph.D. 
thesis, Dept. of Electrical Engineering and Computer 
Science, MIT, March 2001. 


[35] Hung-Min Sun. Improving the security of the 
McEliece public-key cryptosystem. In Proceedings 
of Asiacrypt 98, pages 200-213, 1998. 


[36] Tomas Sander, Adam Young, and Moti Yung. 
Non-interactive cryptocomputing for NC1. In 
Proceedings of the IEEE FOCS, October 1999. 


[37] Leslie G. Valiant. Universal circuits (preliminary 
report). In Conference Record of the Eighth Annual 
ACM Symposium on Theory of Computing, pages 
196-203, Hershey, Pennsylvania, 3-5 May 1976. 


[38] A. Valembois. Recognition of binary linear codes 
as vector-subspaces. In Workshop on Coding and 
Cryptography'99, Book of abstracts, pages 43-51, 
Paris, France, January 1999. 


[39] Johan van Tilburg. Security-Analysis of a Class of 
Cryptosystems Based on Linear Error-Correcting 
Codes. PhD thesis, Technische Universiteit Eindhoven, 
1994. 


[40] Ingo Wegener. The Complexity of Boolean Functions. 
Wiley-Teubner, 1987. 


[41] A.C. Yao. Protocols for secure computations. In 
IEEE Symposium on Foundations of Computer 
Science 82, pages 160-164, Chicago, 1982. 


[42] A.C. Yao. How to generate and exchange secrets. 
In IEEE Symposium on Foundations of Computer 
Science 86, pages 162-167, Toronto, 1986. 


[43] Bennet Yee. Using Secure Coprocessors. Technical 
Report CMU-CS-94-149, School of Computer 
Science, Carnegie Mellon University, May 1994. 


[44] Bennet Yee. A sanctuary for mobile agents. Technical 
Report CS97-537, UC at San Diego, Dept. of 
Computer Science and Engineering, April 1997. 

CARDIS '02: 5th Smart Card Research & Advanced Application Conference 




MICROCAST: Smart Card Based 
(Micro) pay-per-view for Multicast Services 


Josep Domingo-Ferrer, Antoni Martinez-Ballesté and Francesc Sebé 
Universitat Rovira i Virgili 
Dept. of Computer Engineering and Maths 
Av. Països Catalans 26, E-43007 Tarragona, Catalonia 
e-mail {jdomingo, anmartin, fsebe}@etse.urv.es 


Abstract 


With the increased availability of broadband fixed 
and mobile communications, multicast content de- 
livery can be expected to become a very important 
market. Especially for wireless multicast delivery, it 
is important that payment collection be fine-grain: 
the customer should pay only for the content that she 
actually consumes. This can be achieved by using 
pay-per-view based on micropayments. This paper 
proposes the first method for enabling pay-as-you- 
watch services in a multicast content delivery en- 
vironment. On the customer’s side, micropayment 
generation is implemented in a smart card which can 
be plugged into the customer’s receiving device (com- 
puter, digital video receiver, PDA, mobile phone, 
etc.). Micropayment collection and verification are 
distributed among multicast routers, which avoids 
bottlenecks inherent to many-to-one payment trans- 
mission. 


Keywords: Multicast delivery, Pay-per-view, Pay- 
as-you-watch, Micropayments, Smart cards in the 
Internet. 


1 Introduction 


Communication technologies have been evolving in 
many important aspects over the last few years. On 
one hand, broadband communications such as city- 
wide WLANs, ADSL, cable networks and UMTS 
are becoming widespread. On the other hand, audio 
and video compression codecs such as DivX, 
Realmedia, etc. improve the use of the available 
bandwidth. Finally, the appearance of mobile 
phones with high-resolution color displays and 
Internet-enabled PDAs will bring brand new multimedia 
services to everybody, everywhere. There 
are great opportunities to create a huge market for 
multimedia content delivery, featuring news broad- 
casting, videoconferencing, movie channels, on-line 
gambling, etc. Consequently, it seems natural to 
use mobile communications and portable devices, 
along with traditional desktop PC’s, as new privi- 
leged outlets for digital content delivery in pay-per- 
view mode. Smart card based micropayments stand 
out as one of the most promising solutions to obtain 
a fine-grain fee collection service: the customer uses 
her smart card to perform micropayments as con- 
tent is being received. 


Most multimedia delivery services operate in mul- 
ticast mode to send content over the Internet. By 
using multicast, one single data stream can reach 
hundreds, thousands and even millions of target me- 
dia players. 


1.1 Contribution and plan of this paper 


This paper describes a method for enabling pay- 
per-view services in a multicast content delivery en- 
vironment. On the customer’s side, micropayment 
generation is implemented in a smartcard which 
can be plugged into the customer’s receiving de- 
vice (computer, digital video receiver, PDA, mobile 
phone, etc.). Micropayment collection and verifica- 
tion are distributed among multicast routers, which 
avoids bottlenecks inherent to many-to-one payment 
transmission. 


Section 2 gives some background on multicast communication; 
the use of pay-per-view and micropayments 
in multicast is also discussed. Section 3 describes 
the architecture of MICROCAST, a system 
for pay-per-view multicast content delivery. The 
MICROCAST micropayment protocol suite is fully 
described in Section 4. Finally, Section 5 contains 
some conclusions and suggestions for future work. 


2 Multicast communication 


Depending on the number of receivers, three types 
of communication can be distinguished: 


• Unicast communication: one source and one receiver. 


• Broadcast communication: one source node 
and all remaining nodes acting as receivers. As 
an example, consider video broadcast in a LAN: 
the same data are streamed from the source to 
the entire network by using the broadcast IP 
address. 


• Multicast communication: one source and a 
group of receivers. As an example, consider a 
local digital cable TV network, where a particular 
piece of video content is to be distributed 
only to subscribers who are paying for it (rather 
than to the entire neighborhood). 


2.1 Multicast group management 


If a source is to communicate with n receivers, one 
could naively think of using n unicast communications 
(which results in the source being an output 
bottleneck) or one broadcast channel (which results 
in the entire network being flooded). Both solutions 
are wasteful in terms of bandwidth. It should 
be noted that the Internet is nowadays already full 
of millions of IP packets only controlled by their 
time-to-live or by the TCP protocol. 


A better option to avoid increasing network congestion 
is for receivers to join a multicast group and 
have the content sent to them by using their multicast 
group IP address¹ [5, 14]. A multicast group G 
is a set of receivers that are interested in receiving 
a particular kind of information. 
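
As a small illustration of what a multicast group address is, the sketch below checks whether an IPv4 address lies in the multicast range 224.0.0.0 to 239.255.255.255 using Python's standard `ipaddress` module. The helper name `is_multicast_group` is hypothetical and is not part of any multicast protocol.

```python
import ipaddress

# The IPv4 multicast block: 224.0.0.0 through 239.255.255.255.
MULTICAST_BLOCK = ipaddress.ip_network("224.0.0.0/4")

def is_multicast_group(addr: str) -> bool:
    """Return True if addr is an IPv4 multicast group address."""
    ip = ipaddress.ip_address(addr)
    return ip.version == 4 and ip in MULTICAST_BLOCK

if __name__ == "__main__":
    for a in ("224.0.0.1", "239.255.255.255", "192.168.1.10"):
        print(a, is_multicast_group(a))
```

A receiver that joins group G asks the network to deliver packets addressed to one such group address, instead of the source sending one copy per receiver.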


Efficient multicast design and implementation is 
currently an open issue. The multicast task is carried 
out by multicast routers, which join previously 
established multicast groups identified by a multicast 
IP address. These routers are capable of sending 
the data flow to multicast group G. 


The basic tasks to be performed in multicast com- 
munication are: advertise the multicast session, 
manage group enrollment by the customers who 
want to receive the stream and, concurrently to 
group enrollment, build the multicast routing tree. 
Several multicast protocols have been proposed 
in the literature, such as MOSPF [6], PIM-DM [3] and 
PIM-SM [2]. 


2.2 Pay-per-view and pay-as-you-watch 


The name “pay-per-view” is certainly misleading. 
In current digital TV platforms, a fixed monthly 
fee is paid to subscribe to a basic package of chan- 
nels and services. It is also possible to view some 
special “pay-per-view events” (e.g. movies, football 
matches) by paying in advance the price correspond- 
ing to the event. This form of pay-per-view means 
that the content is viewed after the customer has 
paid. There are at least two problems with the fee 
collection scheme just described. One problem is 
that the customer pays for a basic offer that is usu- 
ally expensive for her. The other problem is that, 
in pay-per-view events, the customer pays for the 
whole piece of content: if she wants to stop watching 
anytime, she is losing a part of her money. Pay-per-view 
as contents are being streamed from the server 
to the customer (pay-as-you-watch) seems an option 
that better fits the customer's needs. Successive payments 
can be performed every minute, for example. 
If a customer switches her player off, she has only 
paid for the minutes viewed so far. Of course, these 
frequent payments will be small ones, so credit card 
transactions or electronic payment systems like SET 
are too expensive, too complicated or both [7]. 


2.3 Micropayments 


The operating costs of standard electronic payment 
systems are unaffordable for small amounts 
and can be split into communication and computation 
costs, the latter being caused by the use 
of complex cryptographic techniques such as digital 
signatures. Micropayments are electronic payment 
methods specifically designed to keep operating 
costs very low. In most micropayment systems 
in the literature, computational costs are dramatically 
reduced by replacing digital signatures with 
hash functions [11]. For example, this is the case of 
PayWord and MicroMint [9], where the security of 
coin minting rests on one-way hash functions. 


¹ Multicast addresses are IP numbers in the range between 
224.0.0.0 and 239.255.255.255. 
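
To make the role of one-way hash functions concrete, here is a minimal hash-chain sketch in the spirit of PayWord. It is an illustration under stated assumptions, not the PayWord specification: SHA-256 is chosen arbitrarily, the function names are hypothetical, and the customer's signature on the chain root is omitted.

```python
import hashlib
import secrets

def H(x: bytes) -> bytes:
    """One-way hash function (SHA-256 is an assumption of this sketch)."""
    return hashlib.sha256(x).digest()

def make_chain(length: int) -> list:
    """Return [w_0, ..., w_length] with w_i = H(w_{i+1}).

    w_0 is the root the customer commits to; w_1, w_2, ... are spent in order.
    """
    chain = [secrets.token_bytes(32)]      # w_length, chosen at random
    for _ in range(length):
        chain.append(H(chain[-1]))
    chain.reverse()
    return chain

def verify_payment(root: bytes, coin: bytes, index: int) -> bool:
    """Check that hashing `coin` `index` times gives the committed root."""
    x = coin
    for _ in range(index):
        x = H(x)
    return x == root
```

Revealing w_i is worth i units: anyone can check it against the root with i hash evaluations, but producing w_{i+1} from w_i would require inverting H. No signature is needed per payment, which is where the computational saving comes from.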


The main barrier to using traditional micropayment 
schemes for fee collection in multicast environments 
is their lack of scalability: a large number of re- 
ceiving subscribers eventually overload the source 
with payment implosion. The MICROCAST pro- 
tocol achieves scalability by distributing the effort 
of micropayment collection and verification among 
multicast routers. Unlike traditional micropay- 
ment schemes, MICROCAST does not concentrate 
on minimizing computation for micropayment gen- 
eration and verification. By requiring micropay- 
ments to be less frequent (say every few minutes) 
and verification to be distributed, MICROCAST 
can still use short-exponent discrete exponentiations 
and provide the content source with a proof that ev- 
ery customer has paid. 


More specifically, the scalability of our system is 
based on the following properties not fulfilled by 
conventional micropayment schemes (which are in- 
herently unicast): 


Aggregation Payments collected by routers at one 
level of the multicast tree can be aggregated 
and forwarded to the next upper level towards 
the source. Each aggregation only requires one 
product and one addition. 


Single-step verification Verifying an aggregated 
payment can be done in a single step. There 
is no need to verify each individual payment 
included in the aggregation, which would im- 
ply non-scalability. Payment verification re- 
quires one short-exponent exponentiation, but 
this is no problem, since verification is per- 
formed only once per micropayment period by 
each tree node (regardless of the number of its 
child nodes). 


Note 1 As can be seen in Section 4.4 below, using 
the discrete exponentiation as a one-way function 
is justified by its homomorphic properties, 
which allow payment aggregation and single-step 
verification and are not shared by the (faster) 
one-way hash functions. 
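
The homomorphic property behind this note is that g^a * g^b = g^(a+b). The sketch below shows only that property and is not the MICROCAST payment format, which is defined in Section 4: the pair layout (x, g^x), the function names and the toy parameters (modulus 23, g = 2 of order q = 11) are assumptions chosen so the arithmetic can be checked by hand.

```python
# Toy parameters, assumed for illustration: 2 has order 11 modulo 23.
MODULUS, Q, G = 23, 11, 2

def make_payment(x: int) -> tuple:
    """A hypothetical payment: an exponent and its discrete exponentiation."""
    return (x % Q, pow(G, x, MODULUS))

def aggregate(payments) -> tuple:
    """Combine child payments with one addition and one product each."""
    total_x, total_y = 0, 1
    for x, y in payments:
        total_x = (total_x + x) % Q          # exponents are added modulo q
        total_y = (total_y * y) % MODULUS    # exponentiations are multiplied
    return (total_x, total_y)

def verify(payment) -> bool:
    """Single-step check: one exponentiation, whatever the number of
    payments that were aggregated."""
    x, y = payment
    return pow(G, x, MODULUS) == y
```

A hash function offers no comparable shortcut: H(a) and H(b) cannot be combined into something checkable against a + b, so every individual payment would have to be verified separately.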


2.4 Rekeying 


Multicast routers form a group that receives a mul- 
ticast data stream. The router will possibly send the 
info to a hub that floods all its output connections, 
thus making the information reach every node in 
the subLAN, including nodes whose customers have 
not paid for the content. Cryptography should be 
used to prevent cheaters from being able to view the 
content by using packet sniffers. Customers in the 
multicast group have a decoding key to be able to 
decode the content they receive. 


Hence, legitimate customers are those who pay ev- 
ery multicast period t (a multicast period typically 
lasts a few minutes). When a customer does not 
pay, she will be considered non-legitimate; in this 
case, a rekeying procedure will start which con- 
sists of distributing a new decoding key to every 
remaining legitimate customer. As a result, the re- 
moval of a group member will involve as many uni- 
cast transmissions as legitimate customers remain 
in the group. Fortunately, rekeying reaches a max- 
imum cost of O(logn) when using tree structure 
controls[12, 1]. Even if rekeying is an important 
multicast issue, the reader of this paper only needs 
to keep in mind that it is the procedure started when 
a customer in a multicast group is removed due to 
lack of valid payment or when a new customer joins 
the group. 
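
The gap between the two rekeying costs can be illustrated with a rough count. The sketch assumes a balanced key tree in which removing a member requires replacing the keys on the path from that member's leaf to the root; the function names and the binary default degree are assumptions, and the exact number of messages depends on the tree scheme used in [12, 1].

```python
def flat_rekey_cost(members: int) -> int:
    """Without a tree: one unicast per customer remaining after a removal."""
    return members - 1

def key_tree_depth(members: int, degree: int = 2) -> int:
    """Depth of a balanced key tree with at least `members` leaves."""
    depth, capacity = 0, 1
    while capacity < members:
        capacity *= degree
        depth += 1
    return depth

def tree_rekey_cost(members: int, degree: int = 2) -> int:
    """Keys to replace on a removal: those on the leaf-to-root path."""
    return key_tree_depth(members, degree)

if __name__ == "__main__":
    for n in (1_000, 100_000, 1_000_000):
        print(n, flat_rekey_cost(n), tree_rekey_cost(n))
```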


3 MICROCAST architecture 


As pointed out in Section 1, MICROCAST 
is a pay-as-you-watch system for multicast content 
delivery. A typical application for MICROCAST 
could be pay-as-you-watch video distribution to 
thousands of customers. By using her smart card 
plugged into her video receiver, a customer can join 
a multicast group when she is interested in watching 
an event. After joining a group, the customer makes 
a micropayment every period t to keep watching the 
event. 


In a conventional micropayment system, a bottleneck 
would arise at the video source as a result 
of micropayment collection, because thousands of 
coins arrive every period². RFC 3170 [8] on multicast 
applications recommends that multicast protocols 
should be able to use the multicast router link to 
provide bidirectional communications instead of using 
unicast channels for communication between receivers 
and the source. The MICROCAST architecture follows 
that design principle: multicast routers handle customers 
and coins, which results in a dramatic reduction 
of the amount of payment data sent to the 
content source. 


² A coin can be a 200-bit vector, and period t is short 
enough to keep payment fine grain, i.e. for content reception 
and payment to progress nearly concurrently. 


The MICROCAST system consists of a source, a set 
of multicast routers, the customer smart cards and 
receivers, a rekeying system and a bank (see Figure 
1). Each component is described next: 


• Source. The source is the provider of the 
multicast content. Typical sources can be a 
movie channel, a news service, a music station, 
a sports service, etc. The source sells the 
content to thousands, even millions, of potential 
customers. As content is being delivered, 
the source expects some kind of payment from 
customers or at least something that certifies 
whether each particular customer is currently 
paying. 


• Multicast router. The router in the multicast 
tree also acts as a micropayment subcollector. 
It requests micropayment from its customers 
and, after a timeout, it collects and verifies 
customer micropayments. Then, the router 
forwards valid payments to its parent router in 
the multicast tree, in order for payment information 
to reach the source (or the main micropayment 
collector, depending on the business 
model). 


• Customer device. The customer receiving 
device (say a digital video receiver) is smart 
card enabled. Firstly, the smart card certifies 
through some easy calculations (see Section 4) 
that a payment is made by the customer. Thus, 
the role of the smart card is twofold: 1) authenticate 
payment origin; 2) help enforce subscription 
certificate revocation when the customer 
is not backed by enough money in her 
bank account. 


• Rekeying system. The rekeying system 
maintains a structure of legitimate customers 
(those who pay) and generates and distributes 
a new decoding key whenever any router of the 
system reports that some customer has failed 
to pay. 


• Bank. The bank's role is to act as certification 
authority for customers. If a customer has 
enough funds in her account, the bank gives her 
a kind of public key (see Section 4). 


Figure 1: System architecture for MICROCAST 
(labels recoverable from the figure: subscription certificate; 
smart-card-enabled digital TV receivers; smart card) 


4 The MICROCAST protocol suite 


The MICROCAST protocol suite consists of proto- 
cols for bank setup, customer subscription, multi- 
cast session join, micropayment, customer removal, 
coin redemption and subscriber certificate revoca- 
tion. 


4.1 Bank Setup 


As can be inferred from the previous section, the 
bank is a trusted party. It has a public/private key 
pair which is used to issue customer subscription 
certificates. The bank setup protocol works as fol- 
lows: 


Protocol 1 (Bank setup) The bank does: 


1. Choose a random prime $q$ such that $2^{159} < q < 2^{160}$.
All exponents in the remaining protocols will be modulo $q$,
so we take a relatively small $q$, as is done in the Digital
Signature Standard (DSS, [15]).


2. Choose two large RSA primes $p_1$ and $p_2$ such that
$2^{511} < p_1, p_2 < 2^{512}$ and $q$ divides $p_1 - 1$.
Compute an RSA modulus $n = p_1 p_2$ [10].


3. Randomly choose an RSA public exponent $e$
and compute the corresponding private exponent $d$,
so that $ed \equiv 1 \pmod{\phi(n)}$.


4. Compute a generator $g$ of a cyclic subgroup of
$\mathbb{Z}_n^*$ having order $q$. See the DSS key generation
algorithm [15] on how to find a generator of a
subgroup with a specified order.


5. Publish n, q, e and g in a publicly available 
directory. 


We next show that publication of $q$ does not turn
factoring $n$ into an easy problem. After Protocol 1,
an intruder knows $q$, which is a divisor of $p_1 - 1$.
Equivalently, $r$ exists such that $qr + 1 = p_1$. Note
that the intruder does not know $p_1$. Therefore,
the only strategy to find $r$ is brute-force search until
an $r$ is found such that $qr + 1$ is a divisor of $n$.
Now, according to Protocol 1, $2^{159} < q < 2^{160}$
and $2^{511} < p_1 < 2^{512}$; therefore, the intruder only
knows that $2^{351} < r < 2^{353}$, so a brute-force search
for $r$ is computationally infeasible.
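To make the steps of Protocol 1 concrete, the following Python sketch runs the setup with deliberately tiny, hard-coded primes. The values q = 11, p1 = 23, p2 = 7 and e = 5 are illustrative assumptions, far below the 160-bit and 512-bit sizes required above. The function name `bank_setup` and the way g is derived (raising a base h to the power phi(n)/q) are choices made for this sketch only; a real setup would follow the DSS generation procedure cited in step 4.

```python
from math import gcd

def bank_setup(q, p1, p2, e):
    """Toy version of Protocol 1: returns the public tuple (n, q, e, g) and the private d."""
    assert (p1 - 1) % q == 0, "q must divide p1 - 1"
    n = p1 * p2
    phi = (p1 - 1) * (p2 - 1)
    assert gcd(e, phi) == 1, "e must be invertible modulo phi(n)"
    d = pow(e, -1, phi)  # private exponent: e*d = 1 (mod phi(n))
    # h^(phi/q) raised to the q-th power is h^phi = 1, so its order is 1 or q.
    for h in range(2, n):
        if gcd(h, n) != 1:
            continue
        g = pow(h, phi // q, n)
        if g != 1:
            return (n, q, e, g), d
    raise ValueError("no element of order q found")

# Hypothetical toy parameters, for illustration only.
public, d = bank_setup(q=11, p1=23, p2=7, e=5)
n, q, e, g = public
```

With realistic sizes the primes would be generated at random rather than hard-coded; the structure of the computation stays the same.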


4.2 Customer subscription 


In order to be able to use the system, a customer 
needs a subscription certificate. Through this cer- 
tificate, the bank certifies that the customer has a 
bank account which backs the customer’s payments. 
Subscription certificates are only valid for a period 
T (e.g. one day) and are generated using the pro- 
tocol below. Short certificate validity periods allow 
implicit revocation to be used in the way explained 
in [4]. The duration of period T is a trade-off be- 
tween the cost of key generation and the risk of re- 
voked keys being re-used as described in [4]. 


Protocol 2 (Customer subscription) 


1. Customer U is assigned a unique system identifier $Id_U$.


2. Customer U’s smart card $SC_U$ holds a symmetric
encryption key $k_U$ and generates a batch
of public/private key pairs which are certified
using the asynchronous certification technique
based on certificate verification trees (CVTs)
described in [4]. Specifically, each key pair in
the batch corresponds to a different certificate
validity period T (e.g. a key pair per day) and
is computed as follows:


(a) $SC_U$ generates a random private key $a_U^T$
such that $0 < a_U^T < q$.


(b) The corresponding public key is computed as
$P_U^T := g^{a_U^T} \bmod n$.


(c) $SC_U$ encrypts $a_U^T$ using $k_U$ and sends
$(P_U^T, E_{k_U}(a_U^T))$ to the bank.


3. For each public key received, the bank generates
a certificate statement $C_U^T$ containing the
following data: customer identifier $Id_U$, public
key $P_U^T$ corresponding to period T, certificate
expiration date (end of period T).


4. Following [4], all newly generated certificates in 
the batch are added to the publicly available cer- 
tificate verification tree in the next CVT update. 
The root of the updated CVT is RSA signed 
with the private key d of the bank. 


The security of the private keys $a_U^T$ generated in
Protocol 2 is based on the difficulty of computing
discrete logarithms in the subgroup of size $q$ generated
by $g$. This problem is similar to the modulo $q$
discrete logarithm problem used in [15].


4.3 Multicast session join 


Let us assume that, at micropayment period t and
at public key validity period T, a set of customers
wish to join a multicast session S (see Section 3 for 
an explanation of what a micropayment period is). 
To keep the discussion simple and without loss of 
generality, we assume that a session starts and ends 
within the same public key validity period T. The 
following joining protocol is used: 


Protocol 3 (Session join) 


1. For each customer U in the set of new cus- 
tomers do: 


(a) U sends to the micropayment collector
(typically the content source) her certificate
$C_U^T$ for the current public key validity
period T. The certificate is obtained by U
from the CVT and the corresponding encrypted
private key is obtained from the bank ([4]).
$SC_U$ decrypts $E_{k_U}(a_U^T)$ and obtains $a_U^T$.


(b) The micropayment collector verifies the
validity of $C_U^T$.


(c) If $C_U^T$ is valid and authentic, the micropayment
collector responds with a message containing the
following data: session identifier $Id_S$, session
start date and time $(Date_S, Time_S)$, and value
$Value_S$ of each micropayment.


2. The micropayment collector (or the content
source) includes in the multicast distribution
tree all new customers U whose certificates $C_U^T$
have been successfully validated. The tree reflects
the location of routers and subscribers
(customers).


3. The micropayment collector sends to each 
router R in the tree the following information: 


• For every new subscriber U that is a child
of R in the tree, the current valid subscriber’s
public key $P_U^T$.


• For every router $R_i$ that is a child of R in
the tree, the aggregated key of all descendant
subscribers of $R_i$ in the tree, computed as

$$P_{R_i}^{t} := P_{R_i}^{t-1} \cdot \prod_{U \in desc(R_i,t)} P_U^T \bmod n$$

where $desc(R_i,t)$ is the subset of new customers
being placed under $R_i$ during micropayment
period t. (If the multicast session starts at
micropayment period t, we take $P_{R_i}^{t-1} := 1$.)
Note that

$$P_{R_i}^{t} = P_{R_i}^{t-1} \cdot g^{\sum_{U \in desc(R_i,t)} a_U^T} \bmod n$$


4.4 Micropayment protocols 


Every time a period t finishes, a customer must perform
a micropayment to keep receiving the content
during the next period. The micropayment collector
(i.e. the router in charge of a group of customers)
asks all customers in its group to perform the next
payment as follows:


Protocol 4 (Micropayment request) The micropayment
collector does:


1. Compute $x_t := H(Id_S, Date_S, Time_S, Value_S, t)$,
where $H(\cdot)$ is a one-way hash function.


2. Generate the micropayment request message for
period t as $(x_t, t)$ and multicast it to all customers
in the group.


Customers in a group react to a micropayment request
by generating a coin using the protocol below:


Protocol 5 (Coin generation) The smart card
of each customer U in a group does:


1. Upon receiving the micropayment request for
period t, check
$x_t \stackrel{?}{=} H(Id_S, Date_S, Time_S, Value_S, t)$.


2. Generate a random $a_U^t \in \{1, \dots, q-1\}$ and
compute $A_U^t := g^{a_U^t} \bmod n$.


3. Compute $p_U^t := a_U^T + x_t \cdot a_U^t \bmod q$.


4. The coin for the micropayment corresponding
to period t is the tuple
$coin_U^t := (Id_U, A_U^t, p_U^t, x_t)$.


5. Send $coin_U^t$ up to the parent router.


Coins are generated by customers, which correspond
to the leaves of the multicast tree. In the last step of
Protocol 5, coins are sent by customers to their parent
routers. Such routers check the validity of the received
coins and aggregate valid coins. Aggregation
uses the homomorphic property of the discrete
exponentiation (namely that $g^x \cdot g^y = g^{x+y}$), which is one
strong argument in favor of using the discrete
exponentiation as one-way function (see Note 1). The
aggregated coin is then forwarded by the verifying
router to its parent router and so on up to the tree
root (micropayment collector). Thus, depending on
its level, a router can receive two kinds of coin:


• A single coin from a customer leaf node directly
connected to the router.


• An aggregated coin from a direct child router
(see Protocol 6 below for a description of how
coins are aggregated).


The protocol to aggregate coins by an intermediate
router is as follows:


Protocol 6 (Coin aggregation) 


1. Initialize the new aggregated coin as
$coin_R^t := (list_R^t, A_R^t, p_R^t, x_t) = (\{\}, 1, 0, x_t)$.


2. For each single coin received from customer $U_i$,
that is, $coin_{U_i}^t = (Id_{U_i}, A_{U_i}^t, p_{U_i}^t, x_t)$, do

$coin_R^t := (list_R^t \cup \{Id_{U_i}\}, (A_R^t \cdot A_{U_i}^t) \bmod n, (p_R^t + p_{U_i}^t) \bmod q, x_t)$


3. For each aggregated coin received from router $R_j$,
that is, $coin_{R_j}^t = (list_{R_j}^t, A_{R_j}^t, p_{R_j}^t, x_t)$, do

$coin_R^t := (list_R^t \cup list_{R_j}^t, (A_R^t \cdot A_{R_j}^t) \bmod n, (p_R^t + p_{R_j}^t) \bmod q, x_t)$


4. Send the aggregated coin $coin_R^t$ up to the parent
router.


The protocol to check coin validity is: 


Protocol 7 (Coin validity check) 


1. If a single coin is received from customer $U_i$,
the following check is performed:

$$P_{U_i}^T \cdot (A_{U_i}^t)^{x_t} \stackrel{?}{=} g^{p_{U_i}^t} \pmod n \qquad (1)$$

Note that Check (1) is consistent with the structure
of coins constructed by Protocol 5.


2. If an aggregated coin is received from a child
router $R_j$, the following check is performed:

$$P_{R_j}^t \cdot (A_{R_j}^t)^{x_t} \stackrel{?}{=} g^{p_{R_j}^t} \pmod n \qquad (2)$$
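Protocols 4 to 7 can be exercised end to end with a small Python sketch. Everything specific here is an assumption made for illustration: the toy parameters (n = 161, q = 11, g = 71, where g has order q modulo n), the use of SHA-256 as the one-way hash H, the string encoding of the hashed fields, the session data, and all function names. It is a sketch of the arithmetic, not the authors' smart card implementation.

```python
import hashlib
import secrets

# Toy public parameters (assumed): g has order q in Z_n^*, n = 23 * 7.
n, q, g = 161, 11, 71

def request_value(id_s, date_s, time_s, value_s, t):
    """Protocol 4: x_t := H(Id_S, Date_S, Time_S, Value_S, t); SHA-256 is an assumption."""
    data = "|".join(map(str, (id_s, date_s, time_s, value_s, t))).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def keygen():
    """Protocol 2, steps (a)-(b): private a with 0 < a < q, public P = g^a mod n."""
    a = secrets.randbelow(q - 1) + 1
    return a, pow(g, a, n)

def mint(id_u, a_T, x_t):
    """Protocol 5: coin = (Id_U, A, p, x_t) with A = g^a_t and p = a_T + x_t * a_t mod q."""
    a_t = secrets.randbelow(q - 1) + 1
    return (id_u, pow(g, a_t, n), (a_T + x_t * a_t) % q, x_t)

def check(P, A, p, x_t):
    """Protocol 7: P * A^x_t == g^p (mod n), for single and aggregated coins alike."""
    return (P * pow(A, x_t, n)) % n == pow(g, p, n)

def aggregate(coins, x_t):
    """Protocol 6: fold single or aggregated coins into one aggregated coin."""
    ids, A_R, p_R = frozenset(), 1, 0
    for ident, A, p, _ in coins:
        ids |= ident if isinstance(ident, frozenset) else frozenset([ident])
        A_R = (A_R * A) % n
        p_R = (p_R + p) % q
    return (ids, A_R, p_R, x_t)

# Hypothetical session data and customers.
x_t = request_value("S1", "2002-11-21", "20:00", 1, t=1)
keys = {u: keygen() for u in ("U1", "U2", "U3")}
coins = [mint(u, a, x_t) for u, (a, _) in keys.items()]
agg = aggregate(coins, x_t)

# Aggregated public key of the router: product of its customers' public keys.
P_R = 1
for _, P in keys.values():
    P_R = (P_R * P) % n
```

The single-coin and aggregated checks share one function because Equations (1) and (2) have the same shape; only the public key passed in differs.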


Lemma 1 Assuming that the discrete logarithm 
problem as sketched in Protocol 1 and the RSA prob- 
lem are difficult, coin forgery by an intruder is in- 
feasible. 


Proof: To impersonate customer U and mint a single
coin belonging to U at public key validity period T,
an intruder knows $P_U^T$, $x_t$, $g$ and $n$ and must
find $A_U^t$ and $p_U^t$ such that Equation (1) is satisfied.
There are two ways to proceed:


1. Follow Protocol 5. In that case, the intruder
must know the legitimate customer’s private
key $a_U^T$ (which is protected by the difficulty of
the discrete logarithm problem over the subgroup
generated by $g$).


2. Generate a random $p_U^t$ and compute $A_U^t$ as

$$A_U^t = \left(g^{p_U^t} (P_U^T)^{-1} \bmod n\right)^{x_t^{-1} \bmod \phi(n)} \bmod n$$

Now, given $x_t$, computing $x_t^{-1} \bmod \phi(n)$ without
knowing the factorization of $n$ is the RSA
problem.


Forging an aggregate coin that satisfies Equation (2)
is analogous. □


Lemma 2 The security of a customer’s private key 
does not decrease as the number of coins she mints 
increases. 


Proof: Without loss of generality, compare the situation
where one coin has been minted with the situation
where two coins have been minted. Assume
one coin has been generated during micropayment
period $t_1$ and public key validity period T. Then
the following equation holds:

$$p_U^{t_1} = a_U^T + x_{t_1} a_U^{t_1} \pmod q$$

where only $p_U^{t_1}$ and $x_{t_1}$ are known to possible intruders
(such quantities are part of the coin). Thus,
there is one equation and two unknowns, $a_U^T$ and
$a_U^{t_1}$, so $a_U^T$ cannot be determined. If a second coin
is generated during micropayment period $t_2$,

$$p_U^{t_2} = a_U^T + x_{t_2} a_U^{t_2} \pmod q$$

the number of equations increases to two, but there
are now three unknowns, $a_U^T$, $a_U^{t_1}$ and $a_U^{t_2}$. In general,
it can be seen that generation of m coins results in
m equations with m + 1 unknowns, one of which is
the customer’s private key $a_U^T$. □


4.5 Customer removal 


Customer removal from a group is caused by lack of
valid payment. There may be two situations behind
the lack of valid payment: 1) a customer does not
send any coin to her parent router; 2) a customer
sends an invalid coin to her parent router. Both
situations are handled by the following protocol:


Protocol 8 (Customer removal) 


1. [Routers with child customers] When a router
R in the level immediately above the customers
receives and forwards the micropayment request to its
child customers, R starts a timer. Upon timer
expiration, all child customers U of R that have not
sent valid coins are assumed to have left
the group. Only valid coins will be aggregated
by R and sent up to its parent router. Child
customers U that failed to provide valid coins
will have their public keys $P_U^T$ removed from R’s
memory.


2. [Routers with child routers] Each intermediate
router R on the path to the root of the multicast
tree aggregates valid coins received from child
routers $R_i$ and forwards the aggregated coin to
its parent router. In order to check the validity
of coins using Equation (2), the intermediate
router R needs to update the public key of each
child router $R_i$ as follows:


(a) If $list_{R_i}^t$ in $coin_{R_i}^t$ is the same as $list_{R_i}^{t-1}$
in $coin_{R_i}^{t-1}$, then $P_{R_i}^t := P_{R_i}^{t-1}$.


(b) If $list_{R_i}^t \neq list_{R_i}^{t-1}$, then obtain
$P_{R_i}^t := P_{R_i}^{t-1} \cdot (P_{U_j}^T)^{-1} \bmod n$ for all customers $U_j$
that were in $list_{R_i}^{t-1}$ but are not in $list_{R_i}^t$.
(Only removed customers are dealt with by
this protocol, new customers being handled
by Protocol 3.)


3. [Root node] When the last aggregated coin
reaches the root node (micropayment collector),
the computations performed are the same as those
described for intermediate nodes. In addition, the
root node starts a multicast rekeying procedure
if $list_{Root}^t \neq list_{Root}^{t-1}$, i.e. if customers need to
be removed.
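The key update in step 2(b) amounts to dividing the removed customers' public keys out of the aggregated key. A minimal Python sketch of that update follows, reusing the same toy parameters as an assumption (n = 161, g = 71 of order 11); the function name `update_router_key`, the customer identifiers and the private exponents 3, 5, 7 are hypothetical.

```python
def update_router_key(P_prev, prev_list, new_list, customer_keys, n):
    """Protocol 8, step 2: remove the keys of customers in prev_list but not in new_list."""
    P = P_prev
    for u in prev_list - new_list:
        # Multiply by the modular inverse of the removed customer's public key.
        P = (P * pow(customer_keys[u], -1, n)) % n
    return P

# Toy example (assumed values): three customers, then U2 stops paying.
n, g = 161, 71
customer_keys = {"U1": pow(g, 3, n), "U2": pow(g, 5, n), "U3": pow(g, 7, n)}
P_prev = 1
for P in customer_keys.values():
    P_prev = (P_prev * P) % n

P_new = update_router_key(P_prev, {"U1", "U2", "U3"}, {"U1", "U3"}, customer_keys, n)
P_same = update_router_key(P_prev, {"U1", "U2", "U3"}, {"U1", "U2", "U3"}, customer_keys, n)
```

When the two lists coincide the loop body never runs, which reproduces case (a) without a separate branch.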


4.6 Coin redemption 


During a multicast session, the micropayment collector
stores the final aggregated coin for each performed
micropayment. This coin contains a list
including the identifiers of all customers that performed
a payment. When the number of collected
coins is large enough, the micropayment collector
contacts the bank in order to redeem them.


Protocol 9 (Coin redemption) 


1. When the aggregated coin corresponding to period t
has to be redeemed, the micropayment
collector sends to the bank the session identifier
$Id_S$, the session start date and time
$(Date_S, Time_S)$, the value $Value_S$ of individual
coins in that session, the micropayment period t,
and the final aggregated coin

$$coin_{Root}^t = (list_{Root}^t, A_{Root}^t, p_{Root}^t, x_t)$$


2. The bank does: 


(a) If this is the first coin redemption of the
current multicast session, then check that
the public keys of all customers in the
list field are correctly certified and compute
$P_{Root}^t := \left(\prod_{U_i \in list_{Root}^t} P_{U_i}^T\right) \bmod n$.

(b) If this is not the first coin redemption of
the session, then

i. If $list_{Root}^t$ in $coin_{Root}^t$ is the same as
$list_{Root}^{t-1}$ in $coin_{Root}^{t-1}$, then
$P_{Root}^t := P_{Root}^{t-1}$.

ii. If $list_{Root}^t \neq list_{Root}^{t-1}$, then obtain
$P_{Root}^t$ as the modulo n product of
$P_{Root}^{t-1}$ times the public keys of the new
customers times the multiplicative inverses
of the public keys of the removed customers.


(c) Compute $x_t := H(Id_S, Date_S, Time_S, Value_S, t)$.


(d) Check $P_{Root}^t \cdot (A_{Root}^t)^{x_t} \stackrel{?}{=} g^{p_{Root}^t} \pmod n$.


(e) If all checks are correct, transfer the appropriate
amounts from each customer account to the account
of the micropayment collector (or directly to the
account of the multicast content source, depending
on the business model). In order to
avoid performing microtransfers from each
customer, a better strategy is to cluster
several successive micropayments and perform
a larger transfer from each customer.







CARDIS ’02: 5th Smart Card Research & Advanced Application Conference


USENIX Association 




4.7 Subscription certificate revocation 


It is possible for a customer to run out of funds be- 
fore all of her certificates expire. In this situation, it 
would be possible for her to perform micropayments 
not backed by enough funds in her bank account. 
This situation is detected by the bank during coin 
redemption. In this case, the bank would revoke all 
her subscription certificates for future time inter- 
vals. The implicit revocation mechanism described 
in [4] is used: the bank does not supply any more 
encrypted private keys to the customer’s smart card 
for joining the session in subsequent periods (Pro- 
tocol 3). 


5 Conclusions and future work 


A micropayment protocol suite for multicast pay- 
per-view content delivery has been presented. The 
customer is represented by her smart card in all pro- 
tocols in the suite. The proposed scheme has been 
simulated and works well as long as synchronization
between customers is maintained. The main
goal of the protocol is to distribute micropayment
collection so as to eliminate the bottleneck associated
with M-to-1 applications. Future work will be directed
to scenarios where synchronization within a multi- 
cast group has been lost. A second line of work is 
to speed up coin generation by the customer smart 
card and coin validity check by routers: this would 
require replacing the discrete exponentiation with a 
faster homomorphic one-way function. 
Abstract


This paper describes how to implement the new Ad- 
vanced Encryption Standard (AES) using a modu- 
lar arithmetic crypto-coprocessor, typically used to 
speed up public-key crypto-systems. This idea pro- 
vides a fast and secure AES implementation when 
a dedicated hardware AES module is not available. 
The advantages of using the modular arithmetic co- 
processor when compared to a pure software imple- 
mentation are: 


• much higher execution performance,

• less memory usage, and

• optimized protection against side-channel attacks.


Keywords: AES, Crypto-Coprocessor, Implemen- 
tation Issues, Secure Implementation. 


1 Introduction 


The Advanced Encryption Standard (AES) specifies 
a FIPS-approved (cf. [FIPS]) cryptographic algo- 
rithm that is used to safely protect electronic data. 
The AES algorithm is a symmetric block cipher that 
is able to encrypt (encipher) and decrypt (decipher) 
electronic data. The AES algorithm is capable of us- 
ing cryptographic keys of 128, 192, and 256 bits to 
encrypt and decrypt data blocks of 128 bits. The 
new AES (also known as Rijndael, cf. [DR2]) is an 
algorithm designed to use only single byte operations.
Therefore, it is very suitable for 8-bit µ-processors
with only a few kB of RAM, as commonly used in today's
smart cards. However, Rijndael is also well suited for
32-bit µ-processors with more RAM and clearly for
dedicated hardware implementations, cf. [Wo, WOL, SMTM].
An optimized implementation of the AES algorithm on
an 8051-based µ-controller with a 128-bit key takes
less than 1 ms @ 15 MHz and requires 48 bytes of directly
addressable internal RAM to encrypt a 128-bit data
block, and a little more time to decrypt it. Even if this
is enough for a large variety of applications, there are
others where the bit rate achieved with this
implementation may not be enough (for instance in a
contactless environment) or where there is a demand for
high resistance to physical attacks. On the other hand,
dedicated public-key coprocessors are fast arithmetic
coprocessors that usually can handle non-modular and
especially modular arithmetic on prime fields
$\mathbb{F}_p$ and on fields of characteristic two
$\mathbb{F}_{2^n}$, cf. [NR]. These coprocessors are
designed to be very efficient for RSA and ECC algorithms,
but they are clearly not intended to accelerate the
computation of symmetric key algorithms like DES or AES.
However, some of the operations usually implemented
in a modular arithmetic coprocessor, specifically in
those intended for elliptic curve cryptography, are
still useful to implement the AES because some
transformations of the AES are performed on the field
$\mathbb{F}_{2^8}$. By performing these transformations
within the coprocessor, we can reduce the execution time
of the encryption and decryption algorithms, reduce
the usage of internal RAM and protect the algorithm
against various side-channel attacks
[A, AK1, AK2, CJRR, CKN, DR1, DPV, Gu1, Gu2, KK, Koca],
such as timing attacks [KQ, Koch], power attacks
[AG, BS99, CCD, KJJ, Me], electromagnetic radiation
attacks [SQ] or even fault attacks [ABFHS, BDL, BDHJNT,
BS97, BS02, BMM, JLQ, JPY, JQBD, JQYY, KR, KWMK, Ma, Pai,
SA, YJ, YKLM1, YKLM2, ZM].


Although many implementations of Rijndael have
been published since this algorithm won the AES
contest, none of them so far has used a public-key
crypto-coprocessor. Therefore, we cannot compare our
implementation with any other, and we refer the reader
to [Li] for an overview of alternative
implementations on other platforms.


In the course of this paper we first give some hints 
of the utility of our implementation in many smart 
card applications. In the next chapter we describe 
the minimum requirements for the needed copro- 
cessor and give an example of its required architec- 
ture. Hereafter, we briefly describe the AES itself. 
The following chapter is the most important one, 
as it describes our proposed implementation tech- 
nique used for the AES. Finally, some security con- 
siderations are discussed around the implementation 
presented here and some estimation figures on the 
performance of the implementation are also given. 


2 Applications 


2.1 Chipcard ICs


Chipcards are mainly used to identify and authen- 
ticate a card user to a system. The identification or 
authentication protocol is normally based on sym- 
metric and asymmetric cryptography. Moreover, all 
the data transfers between the Chipcard and the 
Terminal are usually protected by a Message
Authentication Code (MAC) calculated with a symmetric
algorithm. Triple DES is currently the most widely
used symmetric algorithm in smart cards.
However, the new encryption standard (AES) will
progressively replace Triple DES within the next
years. Thus, a very efficient AES implementation 
will be required in those environments where the 
transaction time is required to be as short as possi- 
ble, as in the case of contactless applications. 


2.2 Security ICs 


In the area of Security ICs, like a Trusted Platform
Module or a SmartUSB µ-controller, the use
of a modular arithmetic coprocessor for the AES
implementation described here will provide an
encryption engine that is fast and secure enough for
many applications, like bulk encryption, that could
not be served by a standard software implementation.


2.3 Secure Storage ICs 


The main product that can benefit from the AES
implementation described here is the so-called
multimedia card, also known as a secure storage IC. This
card is typically composed of a large flash memory,
a fast I/O interface and some security logic.
When a small CPU and a modular arithmetic coprocessor
are incorporated, the AES implementation
described here will provide new features like data
encryption and decryption, which make it possible to
build new applications like fast and secure memory
personalization. This kind of application requires an
encryption/decryption engine as fast as the I/O
interface, to avoid a penalty in the execution time
of the application.


2.4 The required modular arithmetic 
coprocessor 


The modular arithmetic coprocessor must have at
least 6 registers (4 if only encryption is implemented),
each at least 16 bytes long. In addition, the
coprocessor shall be able to perform the following
arithmetic and logical operations:


• Multiplication in $\mathbb{F}_{2^d}$, d > 128, of a long
register by an 8-bit value,


• Addition modulo 2 (⊕, i.e. exclusive OR) of two
long registers,


• Right and left shifting of a long register,


• Logical AND of two long registers (optional),


• Simultaneous rotation of 4-byte words (optional).

The following figure gives an example of what such a
coprocessor could look like.


Figure 1: Example of the coprocessor’s architecture.


Here, it is supposed that the standard CPU can directly
operate on the data stored in the coprocessor's registers,
but that operations on these registers are much less
efficient than on the standard CPU internal registers,
because the data stored in those registers have to be
transferred to the CPU through some external bus, as these
data are usually organized as a so-called XRAM.


3 Description of the Advanced Encryption Standard


In this section we briefly describe the Advanced En- 
cryption Standard (AES). For a more detailed de- 
scription we refer to [DR2]. 


AES encrypts plaintexts consisting of lb bytes,
where lb = 16, 24, or 32. The plaintext is organized
as a (4 × Nb) array $(a_{ij})$, $0 \le i < 4$, $0 \le j \le Nb-1$,
where Nb = 4, 6, 8, depending on the value of lb.
The n-th byte of the plaintext is stored in byte $a_{ij}$
with $i = n \bmod 4$, $j = \lfloor n/4 \rfloor$.


AES uses a secret key, called cipher key, consisting
of lk bytes, where lk = 16, 24, or 32. Any
combination of values lb and lk is allowed. The
cipher key is organized in a 4 × Nk array $(k_{ij})$,
$0 \le i < 4$, $0 \le j \le Nk-1$, where Nk = 4, 6, 8,
depending on the value of lk. The n-th key byte is
stored in byte $k_{ij}$ with $i = n \bmod 4$, $j = \lfloor n/4 \rfloor$.


The AES encryption process is composed of
rounds. Except for the last round, each
round consists of four transformations called
ByteSub, ShiftRow, MixColumn, and AddRoundKey.
In the last round the transformation MixColumn is
omitted. The four transformations operate on
intermediate results, called states. A state is a 4 × Nb
array $(a_{ij})$ of bytes. Initially, the state is given by the
plaintext to be encrypted. The number of rounds
Nr is 10, 12, or 14, depending on max{Nb, Nk}. In
addition to the transformations performed in the
Nr rounds there is an AddRoundKey applied to the
plaintext prior to the first round. We call this the
initial AddRoundKey.


Next, we are going to describe the transformations 
used in the AES encryption process. We begin with 
AddRoundKey. 


The transformation AddRoundKey The input to
the transformation AddRoundKey is a state $(a_{ij})$,
$0 \le i < 4$, $0 \le j < Nb$, and a round key, which is an array
of bytes $(rk_{ij})$, $0 \le i < 4$, $0 \le j < Nb$. The output
of AddRoundKey is the state $(b_{ij})$, $0 \le i < 4$, $0 \le j < Nb$,
where

$$b_{ij} = a_{ij} \oplus rk_{ij}.$$


The round keys are obtained from the cipher key by
expanding the cipher key array $(k_{ij})$ into an array
$(k_{ij})$, $0 \le i < 4$, $0 \le j < (Nr+1) \cdot Nb$, called the
expanded key. The round key for the initial application of
AddRoundKey is given by the first Nb columns of the
expanded key. The round key for the application
of AddRoundKey in the m-th round of AES is given
by columns $mNb, \dots, (m+1)Nb - 1$ of the expanded
key, $1 \le m \le Nr$.
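The byte layout and the AddRoundKey transformation described above can be sketched in a few lines of Python. The function names `to_state` and `add_round_key` and the list-of-rows representation are assumptions of this sketch; the round key is simply taken as a given array, since key expansion is not detailed here.

```python
def to_state(block, nb=4):
    """Arrange a byte string into a 4 x Nb array: byte n goes to a[n % 4][n // 4]."""
    assert len(block) == 4 * nb
    return [[block[4 * j + i] for j in range(nb)] for i in range(4)]

def add_round_key(state, round_key):
    """AddRoundKey: b[i][j] = a[i][j] XOR rk[i][j] for every state byte."""
    return [[a ^ rk for a, rk in zip(row, rk_row)]
            for row, rk_row in zip(state, round_key)]

# Example with hypothetical data: a 16-byte block and an arbitrary round key.
state = to_state(bytes(range(16)))
round_key = to_state(bytes([0xA5] * 16))
mixed = add_round_key(state, round_key)
restored = add_round_key(mixed, round_key)
```

Because the transformation is a plain exclusive OR, applying the same round key twice restores the original state, which is why AddRoundKey is its own inverse.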


The transformation ByteSub Given a state
$(a_{ij})$, $0 \le i < 4$, $0 \le j < Nb$, the transformation
ByteSub applies an invertible function
$S: \{0,1\}^8 \to \{0,1\}^8$ to each state byte $a_{ij}$ separately.
The exact nature of S is of no relevance for the
implementation described later. We just mention that S is
non-linear, and in fact, it is the only non-linear part of
the AES encryption process. In practice, S is often
realized by a substitution table or S-box.


The transformation ShiftRow The transformation
ShiftRow cyclically shifts each row of a state
$(a_{ij})$ separately to the left. Row 0 is not shifted.
Rows 1, 2, 3 are shifted by $C_1, C_2, C_3$ bytes, respectively,
where the values of the $C_i$ depend on Nb.


The transformation MixColumn The transformation
MixColumn is crucial to our implementation.
The transformation MixColumn operates on the columns
of a state separately. To each column a fixed linear
transformation is applied. To do so, bytes are interpreted
as elements in the field $\mathbb{F}_{2^8}$. As is usually done,
we will denote elements in this field in hexadecimal
notation. Hence 01, 02 and 03 correspond to
the bytes 00000001, 00000010, and 00000011,
respectively. Now MixColumn applies to each column of
a state the linear transformation defined by the
following matrix

$$\begin{pmatrix} 02 & 03 & 01 & 01 \\ 01 & 02 & 03 & 01 \\ 01 & 01 & 02 & 03 \\ 03 & 01 & 01 & 02 \end{pmatrix}. \qquad (1)$$


One complete round of the AES encryption proce- 
dure is schematically shown in figure 2. 


Figure 2: AES round description, cf. [Sa].
(Stages shown in the diagram: ByteSub, ShiftRow, MixColumn, AddRoundKey.)


The operation xtime The multiplications in
$\mathbb{F}_{2^8}$ necessary to compute the transformation
MixColumn are of great importance to our implementation.
Therefore we are going to describe them
in more detail. First we need to say a few words
about the representation of the field $\mathbb{F}_{2^8}$. In AES
the field $\mathbb{F}_{2^8}$ is represented as

$$\mathbb{F}_{2^8} = \mathbb{F}_2[x]/(x^8 + x^4 + x^3 + x + 1). \qquad (2)$$

That is, elements of $\mathbb{F}_{2^8}$ are polynomials over
$\mathbb{F}_2$ of degree at most 7. The addition and multiplication
of two polynomials is done modulo the polynomial
$x^8 + x^4 + x^3 + x + 1$. Since this is an irreducible
polynomial over $\mathbb{F}_2$, (2) defines a field. In this
representation of $\mathbb{F}_{2^8}$ the byte $a = (a_7, \dots, a_1, a_0)$
corresponds to the polynomial $a_7 x^7 + \dots + a_1 x + a_0$. The
multiplication of an element $a = (a_7, \dots, a_1, a_0)$ in
$\mathbb{F}_{2^8}$ by 01, 02, and 03 is realized by multiplying the
polynomial $a_7 x^7 + \dots + a_1 x + a_0$ with the polynomials
$1$, $x$, $x + 1$, respectively, and reducing the result
modulo $x^8 + x^4 + x^3 + x + 1$. Hence

$$01 \cdot a = a, \qquad 03 \cdot a = 02 \cdot a + a.$$

We see that the only non-trivial multiplication
needed to multiply a column of a state by the matrix
in (1) is the multiplication by 02. Following the
notation in [DR2] we denote the multiplication of
byte a by 02 by xtime(a). The crucial observation
is that xtime(a) is simply a shift of byte a, followed
in some cases by an xor of two bytes. More precisely,
for $a = (a_7, \dots, a_0)$,

$$\mathrm{xtime}(a) = \begin{cases} (a_6, \dots, a_0, 0), & \text{if } a_7 = 0 \\ (a_6, \dots, a_0, 0) \oplus (0,0,0,1,1,0,1,1), & \text{if } a_7 = 1 \end{cases}$$


This finishes our brief description of the AES en- 
cryption procedure. 
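The case distinction for xtime translates directly into code. The following Python sketch operates on a single byte; the helper name `mul03` is an assumption of this sketch, introduced to illustrate the identity 03·a = 02·a + a.

```python
def xtime(a):
    """Multiply byte a by 02 in F_{2^8}, reducing modulo x^8 + x^4 + x^3 + x + 1."""
    shifted = (a << 1) & 0xFF          # (a6, ..., a0, 0)
    if a & 0x80:                       # a7 = 1: xor with (0,0,0,1,1,0,1,1) = 0x1B
        return shifted ^ 0x1B
    return shifted

def mul03(a):
    """Multiply byte a by 03, using 03*a = 02*a + a (addition is xor)."""
    return xtime(a) ^ a
```

For instance, xtime maps 0x57 to 0xAE without reduction, and 0xAE to 0x47 with reduction, since the top bit of 0xAE is set.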


In a pure software implementation of the algorithm 
on an 8051-based µ-controller, these transformations
are performed one after the other within the CPU 
using 48 bytes of directly addressable internal RAM, 
and taking roughly 12000 clock cycles to encrypt a 
128 bit data block with a 128-bit key. The decryp- 
tion algorithm takes about 30% more time than the 
cipher and requires at least the same bytes of inter- 
nal RAM resources. This is due to the fact that the 
software implementation of the inverse MixColumn 
transformation used for decryption is less efficient 
than the MixColumn transformation used for encryp- 
tion. 


4 The public-key coprocessor based 
AES implementation 


The previously mentioned type of public-key coprocessor
is useful to improve the performance
of the following transformations of the AES cipher:


• MixColumn,


• inverse MixColumn,


• KeyExpansion and


• AddRoundKey.


Other transformations like ByteSub and
ShiftRow are performed inside the standard CPU
and therefore remain unchanged. The reason for not
using the coprocessor to accelerate these two
transformations is the following. The fastest way
of performing the ByteSub transformation is by the
use of a look-up table (the so-called S-box) containing
256 8-bit values. Because both table indices and
table contents are 8-bit values, the 8-bit
CPU is the most suitable unit to perform this table
access. Nevertheless, we advise the reader to carefully
consult our section 5 on the physical security
of the AES.


On the other hand, the ShiftRow transformation
can be embedded into the ByteSub transformation
in such a way that there is no performance loss. The
next figure shows which parts are executed
in the CPU and which are executed within the
coprocessor:


Figure 3: Execution of the AES transformations.
(Labels recoverable from the diagram: CPU, coprocessor, MixColumns, X-bus.)


4.1 The MixColumn transformation 


The multiplication of columns (MixColumn) is based
on the xtime operation as defined within the AES
specification. It multiplies a byte of the so-called
state by 2 modulo the irreducible polynomial
$x^8 + x^4 + x^3 + x + 1$. This operation is usually performed
on a byte by left shifting the byte (multiplication by
2) and, in case of overflow, xoring (addition modulo
2) with the hexadecimal value 0x1B.


The MixColumn transformation requires matrix 
multiplication in the field F§. In an 8-bit CPU, this 
can be implemented in an efficient way for each col- 
umn as follows: 


yo = 02* 279 803*27,; G01 * 22001 * zz 
yy = Ol *29 P02 * 2; G03 * ro 01 * 23 
yo = Ol*2z9 @01* 71 ©02* ro G03 * r3 
yz = 03*29 @01* 2) O01 * 22 O02 + 73, 


where * represents the xtime operation. After re- 
ordering the equations we get: 


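\[
\begin{aligned}
y_0 &= 02 * x_0 \oplus 03 * x_1 \oplus x_2 \oplus x_3 \\
y_1 &= 02 * x_1 \oplus 03 * x_2 \oplus x_3 \oplus x_0 \\
y_2 &= 02 * x_2 \oplus 03 * x_3 \oplus x_0 \oplus x_1 \\
y_3 &= 02 * x_3 \oplus 03 * x_0 \oplus x_1 \oplus x_2
\end{aligned}
\]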
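The reordered equations all have the same shape: every output byte is 02 times the byte at its own position, 03 times the next byte, plus the two remaining bytes, with indices taken modulo 4. A minimal per-column sketch, assuming the column is given as a list of four integers (function names are illustrative), could look as follows; multiplication by 03 is computed as xtime(b) xor b.

```python
def xtime(b: int) -> int:
    """Multiply a byte by 2 modulo x^8 + x^4 + x^3 + x + 1."""
    b <<= 1
    return b ^ 0x11B if b & 0x100 else b


def mix_column(x: list) -> list:
    """Apply the reordered MixColumn equations to one 4-byte column."""
    y = []
    for i in range(4):
        a, b, c, d = x[i], x[(i + 1) % 4], x[(i + 2) % 4], x[(i + 3) % 4]
        y.append(xtime(a) ^ (xtime(b) ^ b) ^ c ^ d)  # 02*a + 03*b + c + d
    return y


print([f"{v:02x}" for v in mix_column([0xDB, 0x13, 0x53, 0x45])])
# ['8e', '4d', 'a1', 'bc']
```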


The xtime operation can be performed inside the 
coprocessor on the 16 bytes of the state in parallel 
via the following formula: 


\[
\text{xtime}(\text{state}) = ((\text{state} \,\&\, m_2) \ll 1) \oplus (((\text{state} \,\&\, m_1) \gg 7) * m_3),
\]

where $m_1$ = 0x8080...80 (16 bytes), $m_2$ = 0x7f7f...7f (16 bytes)
and $m_3$ = 0x1b. Here, $*$ denotes the multiplication operation in
$\mathbb{F}_{2^8}$, $\oplus$ is the addition modulo 2, $\&$ the AND
operation and $\ll$ and $\gg$ are the bit-left and bit-right shift
operations respectively.
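The parallel formula can be tried out with a Python integer standing in for a 128-bit coprocessor register. This is only a sketch under that assumption; the names `M1`, `M2`, `M3` and `xtime_state` are illustrative. Because every byte of `(state & M1) >> 7` is either 0 or 1, an ordinary integer product with 0x1b gives the same result as the field multiplication and never carries into a neighbouring byte.

```python
M1 = int.from_bytes(b"\x80" * 16, "big")  # top bit of every byte
M2 = int.from_bytes(b"\x7f" * 16, "big")  # low seven bits of every byte
M3 = 0x1B


def xtime_state(state: int) -> int:
    """xtime applied to all 16 bytes of a 128-bit state at once."""
    doubled = (state & M2) << 1      # shift each byte left, dropping its top bit
    overflow = (state & M1) >> 7     # 0x01 in every byte whose top bit was set
    return doubled ^ (overflow * M3)  # xor 0x1b into exactly those bytes


state = int.from_bytes(bytes.fromhex("57ae478e0080ff01db135345f20a225c"), "big")
print(f"{xtime_state(state):032x}")
```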


The xtime operation itself can be implemented in- 
side the coprocessor with only two temporary regis- 
ters, as shown below: 


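\[
\begin{aligned}
t_1 &= \text{state} \,\&\, m_1 \\
t_1 &= t_1 \gg 7 \\
t_1 &= t_1 * m_3 \\
t_2 &= \text{state} \,\&\, m_2 \\
t_2 &= t_2 \ll 1 \\
t_1 &= t_1 \oplus t_2
\end{aligned}
\]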


If the AND operation is not supported by the coprocessor, it has to
be done in the standard CPU before loading the state into the
coprocessor’s registers. Then, one has to load the results of the two
AND operations into $t_1$ and $t_2$ respectively. Based on the
previous definition of the xtime operation, the whole MixColumn
transformation can be defined to operate on the 16 bytes of the state
in parallel:


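\[
\begin{aligned}
t_1 &= \text{xtime}(\text{state}) \\
t_2 &= t_1 \oplus \text{state} \\
t_2 &= \text{RotWord}(t_2) \\
t_1 &= t_1 \oplus t_2 \\
t_2 &= \text{RotWord}(\text{state}) \\
t_2 &= \text{RotWord}(t_2) \\
t_1 &= t_1 \oplus t_2 \\
t_2 &= \text{RotWord}(t_2) \\
\text{state} &= t_1 \oplus t_2
\end{aligned}
\]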


The total number of registers needed for the implementation of the
MixColumn transformation in the coprocessor is 3: two temporary
registers for the intermediate results and another for the state.


The RotWord operation as defined in the AES spec- 
ification has to be performed on every 4 bytes of 
the state independently. If it is not supported by 
the coprocessor, this operation must be done by the 
standard CPU, accessing the internal coprocessor’s 
registers. 
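The register sequence for the parallel MixColumn can be replayed step by step. The sketch below assumes that a Python integer models each 128-bit register, that the state bytes are stored column by column with the first byte leftmost, and that RotWord rotates every 4-byte word left by one byte; all identifiers are illustrative.

```python
M1 = int.from_bytes(b"\x80" * 16, "big")
M2 = int.from_bytes(b"\x7f" * 16, "big")
LOW3 = int.from_bytes(b"\x00\xff\xff\xff" * 4, "big")  # last 3 bytes of each word
TOP1 = int.from_bytes(b"\xff\x00\x00\x00" * 4, "big")  # first byte of each word


def xtime_state(s: int) -> int:
    return ((s & M2) << 1) ^ (((s & M1) >> 7) * 0x1B)


def rot_word(s: int) -> int:
    """Rotate each 4-byte word: [a0, a1, a2, a3] -> [a1, a2, a3, a0]."""
    return ((s & LOW3) << 8) | ((s & TOP1) >> 24)


def mix_columns(state: int) -> int:
    t1 = xtime_state(state)  # 02 * x_i
    t2 = t1 ^ state          # 03 * x_i
    t2 = rot_word(t2)        # 03 * x_(i+1)
    t1 = t1 ^ t2
    t2 = rot_word(state)
    t2 = rot_word(t2)        # x_(i+2)
    t1 = t1 ^ t2
    t2 = rot_word(t2)        # x_(i+3)
    return t1 ^ t2


state = int.from_bytes(bytes.fromhex("db135345f20a225c01010101c6c6c6c6"), "big")
print(f"{mix_columns(state):032x}")  # 8e4da1bc9fdc589d01010101c6c6c6c6
```

Only `t1`, `t2` and the state itself are live at any time, in line with the register count of 3 given above.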


4.2 The inverse MixColumn transformation


The inverse MixColumn transformation also requires a matrix
multiplication in the field $\mathbb{F}_{2^8}$. In an 8-bit CPU, this
can be implemented in an efficient way for each column as follows:


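\[
\begin{aligned}
y_0 &= 0e * x_0 \oplus 0b * x_1 \oplus 0d * x_2 \oplus 09 * x_3 \\
y_1 &= 09 * x_0 \oplus 0e * x_1 \oplus 0b * x_2 \oplus 0d * x_3 \\
y_2 &= 0d * x_0 \oplus 09 * x_1 \oplus 0e * x_2 \oplus 0b * x_3 \\
y_3 &= 0b * x_0 \oplus 0d * x_1 \oplus 09 * x_2 \oplus 0e * x_3.
\end{aligned}
\]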


After reordering the equations we get: 


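\[
\begin{aligned}
y_0 &= 0e * x_0 \oplus 0b * x_1 \oplus 0d * x_2 \oplus 09 * x_3 \\
y_1 &= 0e * x_1 \oplus 0b * x_2 \oplus 0d * x_3 \oplus 09 * x_0 \\
y_2 &= 0e * x_2 \oplus 0b * x_3 \oplus 0d * x_0 \oplus 09 * x_1 \\
y_3 &= 0e * x_3 \oplus 0b * x_0 \oplus 0d * x_1 \oplus 09 * x_2.
\end{aligned}
\]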


As for the MixColumn, the inverse transformation (needed for
decryption) can also be defined to operate on the 16 bytes of the
state in parallel. The implementation is based on the previous
definition of the xtime operation:


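\[
\begin{aligned}
t_1 &= \text{xtime}(\text{state}) \\
t_2 &= \text{xtime}(t_1) \\
t_3 &= \text{xtime}(t_2) \\
t_4 &= t_1 \oplus t_2 \oplus t_3 \\
t_2 &= \text{state} \oplus t_2 \oplus t_3 \\
t_1 &= \text{state} \oplus t_1 \oplus t_3 \\
t_3 &= \text{state} \oplus t_3 \\
t_1 &= \text{RotWord}(t_1) \\
t_2 &= \text{RotWord}(\text{RotWord}(t_2)) \\
t_3 &= \text{RotWord}(\text{RotWord}(\text{RotWord}(t_3))) \\
\text{state} &= t_1 \oplus t_2 \oplus t_3 \oplus t_4
\end{aligned}
\]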


The total number of registers needed for the implementation of the
inverse transformation in the coprocessor is 5: four temporary
registers for intermediate results and one register for the state
itself.
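The sequence for the inverse transformation builds the four coefficients from three successive doublings: 0e = 8 + 4 + 2, 0d = 8 + 4 + 1, 0b = 8 + 2 + 1 and 09 = 8 + 1, where + is the xor. A minimal sketch under the same assumptions as before (Python integers as 128-bit registers, column-wise byte order, illustrative names):

```python
M1 = int.from_bytes(b"\x80" * 16, "big")
M2 = int.from_bytes(b"\x7f" * 16, "big")
LOW3 = int.from_bytes(b"\x00\xff\xff\xff" * 4, "big")
TOP1 = int.from_bytes(b"\xff\x00\x00\x00" * 4, "big")


def xtime_state(s: int) -> int:
    return ((s & M2) << 1) ^ (((s & M1) >> 7) * 0x1B)


def rot_word(s: int) -> int:
    return ((s & LOW3) << 8) | ((s & TOP1) >> 24)


def inv_mix_columns(state: int) -> int:
    t1 = xtime_state(state)   # 02 * x
    t2 = xtime_state(t1)      # 04 * x
    t3 = xtime_state(t2)      # 08 * x
    t4 = t1 ^ t2 ^ t3         # 0e * x
    t2 = state ^ t2 ^ t3      # 0d * x
    t1 = state ^ t1 ^ t3      # 0b * x
    t3 = state ^ t3           # 09 * x
    t1 = rot_word(t1)                       # 0b * x_(i+1)
    t2 = rot_word(rot_word(t2))             # 0d * x_(i+2)
    t3 = rot_word(rot_word(rot_word(t3)))   # 09 * x_(i+3)
    return t1 ^ t2 ^ t3 ^ t4


mixed = int.from_bytes(bytes.fromhex("8e4da1bc9fdc589d01010101c6c6c6c6"), "big")
print(f"{inv_mix_columns(mixed):032x}")  # db135345f20a225c01010101c6c6c6c6
```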


Another way to implement the inverse MixColumn transformation is to
define the following two new operations:


\[
\begin{aligned}
\text{xtime}_4(\text{state}) &= ((\text{state} \,\&\, m_5) \ll 2) \oplus (((\text{state} \,\&\, m_4) \gg 6) * m_3) \\
\text{xtime}_8(\text{state}) &= ((\text{state} \,\&\, m_7) \ll 3) \oplus (((\text{state} \,\&\, m_6) \gg 5) * m_3),
\end{aligned}
\]

where $m_4$ = 0xc0c0...c0 (16 bytes), $m_5$ = 0x3f3f...3f (16 bytes),
$m_6$ = 0xe0e0...e0 (16 bytes), $m_7$ = 0x1f1f...1f (16 bytes) and
$m_3$ = 0x1b. Therefore, the implementation of the inverse MixColumn
transformation can be redefined as follows:


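\[
\begin{aligned}
t_1 &= \text{xtime}(\text{state}) \\
t_2 &= \text{xtime}_4(\text{state}) \\
t_3 &= \text{xtime}_8(\text{state}) \\
t_4 &= t_1 \oplus t_2 \oplus t_3 \\
t_2 &= \text{state} \oplus t_2 \oplus t_3 \\
t_1 &= \text{state} \oplus t_1 \oplus t_3 \\
t_3 &= \text{state} \oplus t_3 \\
t_1 &= \text{RotWord}(t_1) \\
t_2 &= \text{RotWord}(\text{RotWord}(t_2)) \\
t_3 &= \text{RotWord}(\text{RotWord}(\text{RotWord}(t_3))) \\
\text{state} &= t_1 \oplus t_2 \oplus t_3 \oplus t_4
\end{aligned}
\]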


The advantage of this second implementation is that the operations
xtime, xtime_4 and xtime_8 can all be calculated in parallel from the
state, avoiding the sequential dependency of the first
implementation. Moreover, if the AND operation is not available
within the coprocessor, this second solution allows all the AND
values to be precomputed within the standard CPU before loading the
state into the coprocessor.
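One detail of the two new operations deserves care: the shifted-out part of a byte now holds up to three bits, so its product with 0x1b has to be a carry-less (polynomial) product rather than an ordinary integer product. The sketch below makes that explicit by writing the product with 0x1b = x^4 + x^3 + x + 1 as four shifted copies xored together. Python integers model the 128-bit registers, and all names are illustrative assumptions.

```python
def rep(byte: int) -> int:
    """Repeat one byte value over a 16-byte (128-bit) register."""
    return int.from_bytes(bytes([byte]) * 16, "big")


M1, M2 = rep(0x80), rep(0x7F)
M4, M5 = rep(0xC0), rep(0x3F)
M6, M7 = rep(0xE0), rep(0x1F)


def mul_m3(h: int) -> int:
    """Carry-less product of every byte of h with 0x1b.

    Every byte of h must be below 8, so that no byte overflows into its
    neighbour; this holds for the shifted-out bits used here.
    """
    return (h << 4) ^ (h << 3) ^ (h << 1) ^ h


def xtime(state: int) -> int:
    return ((state & M2) << 1) ^ mul_m3((state & M1) >> 7)


def xtime4(state: int) -> int:
    return ((state & M5) << 2) ^ mul_m3((state & M4) >> 6)


def xtime8(state: int) -> int:
    return ((state & M7) << 3) ^ mul_m3((state & M6) >> 5)


# 0xc0 has both top bits set, the case where an integer product would be wrong.
state = rep(0xC0)
print(f"{xtime4(state):032x}")               # 2d repeated 16 times
print(xtime4(state) == xtime(xtime(state)))  # True
```

All three functions read only `state`, which is what allows them to be evaluated independently of one another.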


4.3 The Key Expansion


The 16, 24 or 32 bytes of the key (depending on the key length) are
loaded into the Key register¹ of the coprocessor (Key1 and Key2
registers for 256-bit keys). Then, the next round key bytes are
calculated with the following sequence of operations.


For a 128-bit key, perform the following sequence for each
intermediate round:


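\[
\begin{aligned}
t_1 &= \text{Rcon} \oplus \text{ByteSub}(\text{RotWord}(\text{Key})) \\
\text{Key} &= \text{Key} \oplus t_1 \\
t_1 &= \text{Key} \\
t_1 &= t_1 \gg 32 \\
\text{Key} &= \text{Key} \oplus t_1 \\
t_1 &= t_1 \gg 32 \\
\text{Key} &= \text{Key} \oplus t_1 \\
t_1 &= t_1 \gg 32 \\
\text{Key} &= \text{Key} \oplus t_1.
\end{aligned}
\]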


The RotWord and ByteSub operations are performed by the standard CPU
on the 4 rightmost bytes of the Key register; the result is then
stored into the 4 leftmost bytes of $t_1$ and the other bytes are
cleared. Rcon is the 4-byte constant defined within the AES
specification.
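The 128-bit sequence can be checked against the key-expansion example of the AES specification. The sketch below assumes a Python integer models the 128-bit Key register with byte k0 leftmost; it derives the ByteSub table from its algebraic definition (field inverse followed by the affine map) rather than listing 256 values. The function names, and passing Rcon as a single byte, are choices made for this illustration.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes modulo x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def build_sbox() -> list:
    """Derive the 256-entry ByteSub table."""
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gf_mul(x, y) == 1), 0)
        s = inv
        for k in range(1, 5):  # affine map: xor of four byte rotations
            s ^= ((inv << k) | (inv >> (8 - k))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


SBOX = build_sbox()


def next_round_key(key: int, rcon: int) -> int:
    """One key-expansion round on the 128-bit Key register."""
    word = key & 0xFFFFFFFF                           # 4 rightmost bytes
    word = ((word << 8) & 0xFFFFFFFF) | (word >> 24)  # RotWord (in the CPU)
    word = int.from_bytes(bytes(SBOX[b] for b in word.to_bytes(4, "big")), "big")
    t1 = (word ^ (rcon << 24)) << 96  # result in the 4 leftmost bytes, rest 0
    key ^= t1
    t1 = key
    for _ in range(3):                # three shift-and-xor steps
        t1 >>= 32
        key ^= t1
    return key


key = 0x2B7E151628AED2A6ABF7158809CF4F3C
print(f"{next_round_key(key, 0x01):032x}")  # a0fafe1788542cb123a339392a6c7605
```

Calling `next_round_key` again on the result with `rcon = 0x02` yields the following round key.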


¹ Mapping the encryption or decryption key to the Key register is
straightforward: bytes a0, a1, ..., a15 of the key are mapped one to
one to bytes k0, k1, ..., k15 of the Key register respectively.


For a 256-bit key, perform the following sequence for each
intermediate “even” round:


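\[
\begin{aligned}
t_1 &= \text{Rcon} \oplus \text{ByteSub}(\text{RotWord}(\text{Key2})) \\
\text{Key1} &= \text{Key1} \oplus t_1 \\
t_1 &= \text{Key1} \\
t_1 &= t_1 \gg 32 \\
\text{Key1} &= \text{Key1} \oplus t_1 \\
t_1 &= t_1 \gg 32 \\
\text{Key1} &= \text{Key1} \oplus t_1 \\
t_1 &= t_1 \gg 32 \\
\text{Key1} &= \text{Key1} \oplus t_1
\end{aligned}
\]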


while every intermediate “odd” round (except round 
1) is done as: 


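\[
\begin{aligned}
t_1 &= \text{ByteSub}(\text{Key1}) \\
\text{Key2} &= \text{Key2} \oplus t_1 \\
t_1 &= \text{Key2} \\
t_1 &= t_1 \gg 32 \\
\text{Key2} &= \text{Key2} \oplus t_1 \\
t_1 &= t_1 \gg 32 \\
\text{Key2} &= \text{Key2} \oplus t_1 \\
t_1 &= t_1 \gg 32 \\
\text{Key2} &= \text{Key2} \oplus t_1
\end{aligned}
\]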
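For 256-bit keys the two sequences alternate: an “even” round updates Key1 from the last word of Key2, and the following “odd” round updates Key2 from the last word of the new Key1, without RotWord or Rcon. The sketch below assumes, as in the 128-bit case, that ByteSub acts on the 4 rightmost bytes and that its result is placed in the 4 leftmost bytes of t1. Python integers model the registers and every identifier is illustrative.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes modulo x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a = (a << 1) ^ (0x11B if a & 0x80 else 0)
        b >>= 1
    return r


def sbox_entry(x: int) -> int:
    """ByteSub value of x: field inverse followed by the affine map."""
    inv = next((y for y in range(1, 256) if gf_mul(x, y) == 1), 0)
    s = inv ^ 0x63
    for k in range(1, 5):
        s ^= ((inv << k) | (inv >> (8 - k))) & 0xFF
    return s


SBOX = [sbox_entry(x) for x in range(256)]


def sub_word(word: int) -> int:
    return int.from_bytes(bytes(SBOX[b] for b in word.to_bytes(4, "big")), "big")


def cascade(key: int, t1: int) -> int:
    """Xor t1 into key, then three shift-by-32 and xor steps."""
    key ^= t1
    t1 = key
    for _ in range(3):
        t1 >>= 32
        key ^= t1
    return key


def even_round(key1: int, key2: int, rcon: int) -> int:
    """Return the new Key1."""
    word = key2 & 0xFFFFFFFF
    word = ((word << 8) & 0xFFFFFFFF) | (word >> 24)  # RotWord
    return cascade(key1, (sub_word(word) ^ (rcon << 24)) << 96)


def odd_round(key1: int, key2: int) -> int:
    """Return the new Key2."""
    return cascade(key2, sub_word(key1 & 0xFFFFFFFF) << 96)


key1 = 0x603DEB1015CA71BE2B73AEF0857D7781
key2 = 0x1F352C073B6108D72D9810A30914DFF4
key1 = even_round(key1, key2, 0x01)
key2 = odd_round(key1, key2)
print(f"{key1:032x}")  # 9ba354118e6925afa51a8b5f2067fcde
print(f"{key2:032x}")  # a8b09c1a93d194cdbe49846eb75d5b9a
```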


For 192-bit keys, the sequence gets more complicated because new
round key bytes are generated within a window of 6 words (24 bytes),
whereas round key bytes have to be delivered at a rate of 4 words
(16 bytes) per round. Basically, the process to generate the new
round key bytes is similar to that for 128-bit keys, but longer
registers (24 bytes long) and/or an additional temporary register
might be needed.


In total, the number of registers needed for the implementation of
the Key Expansion transformation within the coprocessor is 2 (or at
most 3 for keys longer than 16 bytes).


4.4 The AddRoundKey transformation 


This transformation is performed by simply adding the state and the
key modulo 2 inside the coprocessor:

\[
\text{state} = \text{Key} \oplus \text{state}.
\]


No temporary register is therefore needed. In the case of 256-bit
keys, the Key register used is Key1 or Key2, depending on the round
number (Key1 for “even” rounds and Key2 for “odd” rounds).


5 Security Considerations


Although there is a large variety of possible physical attacks on the
AES, cf. [AG, BS99, BS02, CJRR, DR1, KQ, KWMK, Me, YT], the xtime
operation is clearly the most critical one in the AES algorithm, at
least with respect to physical security or so-called side-channel
attacks. Namely, this operation involves a multiplication that is
subject to timing and fault attacks (see [KQ, BS02]). We also stress
that the recently discovered fault-based vulnerability due to [BS02]
cannot be avoided by the simple dedicated fault-tolerant AES hardware
proposed in [KWMK].


However, thanks to the implementation described here, the aforesaid
timing attack on the xtime operation does not work. This is because
the timing behaviour of modern crypto coprocessors is independent of
their operands, which removes the timing attack vulnerability from
our implementation.
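The timing issue stems from the data-dependent "in case of overflow" step of the byte-wise xtime. The sketch below contrasts that form with a branch-free one in which the same operations are executed for every input, which is the property the masked coprocessor formula has. It only illustrates the data flow: Python itself offers no timing guarantees, and the function names are illustrative.

```python
def xtime_branching(b: int) -> int:
    """Byte-wise xtime with a data-dependent branch."""
    b <<= 1
    if b & 0x100:   # taken only when the top bit of the input was set
        b ^= 0x11B
    return b


def xtime_branch_free(b: int) -> int:
    """Same result, but with an identical operation sequence for every input."""
    mask = -(b >> 7) & 0x1B   # 0x1b if the top bit is set, otherwise 0
    return ((b << 1) & 0xFF) ^ mask


print(all(xtime_branching(b) == xtime_branch_free(b) for b in range(256)))  # True
```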


Moreover, by performing the xtime operation on 16 bytes in parallel
we make fault attacks very difficult to achieve: an attacker can use
a fault in the calculation to flip a bit, but the flipped bit can be
any one of the 128 bits of the state or of a temporary variable.
Another critical part of the implementation described here might be
the transfer of data through the so-called X-BUS, the bus that
connects the CPU and the coprocessor. This transfer of data is more
significant when the AND and Rotate operations are not supported by
the coprocessor and therefore have to be performed within the
standard CPU. The bus contents could then be tampered with via an
electron microscope or a focused ion beam, or could be revealed by
measuring the power consumption or even by an electromagnetic field
analysis.


Fortunately, some µ-controller IC vendors protect this X-BUS by
hardware and/or software mechanisms. Among the hardware
countermeasures there are active shields or random bus scrambling
techniques, available on some existing high-security µ-controller
ICs. The latest generation of those high-security µ-controller ICs is
designed using a special dual-rail security logic, cf. [MAK, MACMT].
This logic not only ensures that both a “0” and a “1” have the same
Hamming weight, but also that changes between a logical “0” and a
logical “1” are not distinguishable by an adversary.


As software measures, some masking and encryption techniques could be
applied to the data before it is transferred, both in the CPU and in
the coprocessor. However, these measures may have a significant
impact on the overall performance of the algorithm, which makes the
aforesaid hardware countermeasures the preferred choice in practice.


6 Performance Estimation 


An implementation of the AES encryption algorithm with a key length
of 128 bits on Infineon's SLE66P (8051-based) security controller
family, cf. [Inf2], combined with Infineon's recently developed
modular arithmetic coprocessor Spiridon, cf. [Inf1] (which has no AND
or RotWord operation), is approximately two times faster than an
optimized 8051-based implementation, and requires only 16 bytes of
internal RAM. Most importantly, this implementation greatly benefits
from the high security against physical attacks offered by the
Spiridon coprocessor, which will be described in another publication.


However, we expect an implementation using an optimal modular
arithmetic coprocessor, with all the operations described at the
beginning of the present paper, to be at least a factor of four
faster than the implementation on Infineon's Spiridon.